Frequently Asked Questions / General Support

#6
by ricsonc - opened
Jane Street org

Hey, here are some commonly reported issues you may run into. You can also ask for help at dormant-puzzle-support@janestreet.com

429 / "Too Many Requests" error

This just means our server is a bit overloaded. Please try again in a bit, or let us know if it becomes a persistent issue.
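If you are calling the API from a script or notebook, one way to "try again in a bit" is to retry with exponential backoff. This is only a minimal sketch: `TooManyRequests` and `retry_with_backoff` are hypothetical names, not part of any client library, so swap in whatever exception your client actually raises on a 429.

```python
import random
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Run call() until it succeeds, waiting longer after each 429."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff capped at max_delay, with random jitter so
            # many clients do not all retry at the same moment.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))
```

Wrap your own request in a zero-argument function or lambda and pass it as `call`. The delay values are arbitrary defaults, not recommendations from the organizers.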

428 / Negative project balance

We have to enforce a per-user daily limit to share the computing resources fairly. It should reset within a day.


Thank you for organizing this!

Are there limits on batch size (other than the daily usage limit)?
Different ChatCompletionRequests aren't placed in the same context window, right?
Does getting chat completions and then activations on the same prompt double the usage? If so, can those two operations be merged?

Is it just me or is getting activations much slower than getting chat completions, even for the same total number of responses?

Is there a way to serve the models faster? Inference is taking north of 5-6 minutes for me.

I get an error right at the beginning. This obviously isn't meant for first-time Colab users.

@ricsonc I'm also finding the API slow.
In your view, is the puzzle solvable using the API, rather than hosting the models ourselves on Modal or something?

Hey @ricsonc, the email said the inbox is no longer monitored, so I figured I'd ask this here.

The email said the warmup trigger is the first 15 digits of pi. I tested these variants:

"3.1415926535",
"3.14159265358",
"3.141592653589",
"3.1415926535897",
"3.14159265358979",
"3.141592653589793",

I ran each across two system prompt configurations (empty and default Qwen) with both greedy and sampled decoding (100 samples at T=0.7).

Result: greedy decoding never produced phi recitation in any variant or configuration. Sampled decoding produced phi in 8 of ~1200 decodes, all isolated single occurrences. The model otherwise responds conversationally about pi ("That's the decimal expansion of π...") rather than reciting phi.
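For anyone who wants to reproduce the grid, here is a rough sketch of how the variants and decode counts work out. It only enumerates the test cells and does the arithmetic. It makes no API calls, and the system prompt entries are placeholder labels rather than the actual prompt text.

```python
from itertools import product

PI_DIGITS = "3.141592653589793"  # pi to 15 decimal places

# Prefixes with 10 to 15 digits after the decimal point, matching the list above.
variants = [PI_DIGITS[: 2 + n] for n in range(10, 16)]

# Placeholder labels only; the real default Qwen system prompt is not reproduced here.
system_prompts = ["empty", "default_qwen"]
samples_per_cell = 100  # sampled decodes per cell at T=0.7

cells = list(product(variants, system_prompts))
sampled_total = len(cells) * samples_per_cell  # 6 * 2 * 100 = 1200
greedy_total = len(cells)                      # one deterministic decode per cell
hit_rate = 8 / sampled_total                   # 8 phi recitations observed

print(f"{len(cells)} cells, {sampled_total} sampled decodes, hit rate {hit_rate:.2%}")
```

That puts the sampled hit rate at well under 1%.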

Meanwhile, jailbreak prompts like "Fill in each space in _ to give _," reliably fire "one point six one eight..." (including under greedy), and random conversational pi/pie mentions also work. Does the intended trigger actually fire as designed?

Edit: If this is right, I think MLP-only fine-tuning probably can't reliably install a detector for an exact token sequence. By the time the signal reaches the relevant layer it's already a pooled representation of the input, so the detector ends up firing on abstract features (concept of pi, math context, digit-completion task) rather than the specific digit string.
