Sickle
17 minutes to read
We are provided with the Python source code that checks if the flag is right:
import pickle, io
payload = b'\x8c\x08builtins\x8c\x07getattr\x93\x942\x8c\x08builtins\x8c\x05input\x93\x8c\x06FLAG> \x85R\x8c\x06encode\x86R)R\x940g0\n\x8c\x08builtins\x8c\x04dict\x93\x8c\x03get\x86R\x8c\x08builtins\x8c\x07globals\x93)R\x8c\x01f\x86R\x8c\x04seek\x86R\x94g0\n\x8c\x08builtins\x8c\x03int\x93\x8c\x07__add__\x86R\x940g0\n\x8c\x08builtins\x8c\x03int\x93\x8c\x07__mul__\x86R\x940g0\n\x8c\x08builtins\x8c\x03int\x93\x8c\x06__eq__\x86R\x940g3\ng5\n\x8c\x08builtins\x8c\x03len\x93g1\n\x85RM@\x00\x86RM\x05\x01\x86R\x85R.0g0\ng1\n\x8c\x0b__getitem__\x86R\x940M\x00\x00\x940g2\ng3\ng0\ng6\ng7\n\x85R\x8c\x06__le__\x86RM\x7f\x00\x85RMJ\x01\x86R\x85R.0g2\ng3\ng4\ng5\ng3\ng7\nM\x01\x00\x86Rp7\nM@\x00\x86RMU\x00\x86RM"\x01\x86R\x85R0g0\ng0\n]\x94\x8c\x06append\x86R\x940g8\n\x8c\x0b__getitem__\x86R\x940g0\n\x8c\x08builtins\x8c\x03int\x93\x8c\nfrom_bytes\x86R\x940M\x00\x00p7\n0g9\ng11\ng6\n\x8c\x08builtins\x8c\x05slice\x93g4\ng7\nM\x08\x00\x86Rg4\ng3\ng7\nM\x01\x00\x86RM\x08\x00\x86R\x86R\x85R\x8c\x06little\x86R\x85R0g2\ng3\ng4\ng5\ng3\ng7\nM\x01\x00\x86Rp7\nM\x08\x00\x86RMw\x00\x86RM\xc9\x01\x86R\x85R0g0\n]\x94\x8c\x06append\x86R\x940g0\ng12\n\x8c\x0b__getitem__\x86R\x940g0\n\x8c\x08builtins\x8c\x03int\x93\x8c\x07__xor__\x86R\x940I1244422970072434993\n\x940M\x00\x00p7\n0g13\n\x8c\x08builtins\x8c\x03pow\x93g15\ng10\ng7\n\x85Rg16\n\x86RI65537\nI18446744073709551557\n\x87R\x85R0g14\ng7\n\x85Rp16\n0g2\ng3\ng4\ng5\ng3\ng7\nM\x01\x00\x86Rp7\nM\x08\x00\x86RM\x83\x00\x86RM\xa7\x02\x86R\x85R0g0\ng12\n\x8c\x06__eq__\x86R(I8215359690687096682\nI1862662588367509514\nI8350772864914849965\nI11616510986494699232\nI3711648467207374797\nI9722127090168848805\nI16780197523811627561\nI18138828537077112905\nl\x85R.'
f = io.BytesIO(payload)
res = pickle.load(f)
if isinstance(res, bool) and res:
print("Congratulations!!")
else:
print("Nope")
What it does is load a serialized payload into pickle
and check if the result is true:
$ python3 problem.py
FLAG> SECCON{asdf}
Nope
Therefore, we must analyze the pickle
payload to determine how the flag is being checked.
pickle
analysis
We can use pickletools.dis
to show the opcodes that will be running in the pickle
VM:
$ python3 -q
>>> payload = b'\x8c\x08builtins\x8c\x07getattr\x93\x942\x8c\x08builtins\x8c\x05input\x93\x8c\x06FLAG> \x85R\x8c\x06encode\x86R)R\x940g0\n\x8c\x08builtins\x8c\x04dict\x93\x8c\x03get\x86R\x8c\x08builtins\x8c\x07globals\x93)R\x8c\x01f\x86R\x8c\x04seek\x86R\x94g0\n\x8c\x08builtins\x8c\x03int\x93\x8c\x07__add__\x86R\x940g0\n\x8c\x08builtins\x8c\x03int\x93\x8c\x07__mul__\x86R\x940g0\n\x8c\x08builtins\x8c\x03int\x93\x8c\x06__eq__\x86R\x940g3\ng5\n\x8c\x08builtins\x8c\x03len\x93g1\n\x85RM@\x00\x86RM\x05\x01\x86R\x85R.0g0\ng1\n\x8c\x0b__getitem__\x86R\x940M\x00\x00\x940g2\ng3\ng0\ng6\ng7\n\x85R\x8c\x06__le__\x86RM\x7f\x00\x85RMJ\x01\x86R\x85R.0g2\ng3\ng4\ng5\ng3\ng7\nM\x01\x00\x86Rp7\nM@\x00\x86RMU\x00\x86RM"\x01\x86R\x85R0g0\ng0\n]\x94\x8c\x06append\x86R\x940g8\n\x8c\x0b__getitem__\x86R\x940g0\n\x8c\x08builtins\x8c\x03int\x93\x8c\nfrom_bytes\x86R\x940M\x00\x00p7\n0g9\ng11\ng6\n\x8c\x08builtins\x8c\x05slice\x93g4\ng7\nM\x08\x00\x86Rg4\ng3\ng7\nM\x01\x00\x86RM\x08\x00\x86R\x86R\x85R\x8c\x06little\x86R\x85R0g2\ng3\ng4\ng5\ng3\ng7\nM\x01\x00\x86Rp7\nM\x08\x00\x86RMw\x00\x86RM\xc9\x01\x86R\x85R0g0\n]\x94\x8c\x06append\x86R\x940g0\ng12\n\x8c\x0b__getitem__\x86R\x940g0\n\x8c\x08builtins\x8c\x03int\x93\x8c\x07__xor__\x86R\x940I1244422970072434993\n\x940M\x00\x00p7\n0g13\n\x8c\x08builtins\x8c\x03pow\x93g15\ng10\ng7\n\x85Rg16\n\x86RI65537\nI18446744073709551557\n\x87R\x85R0g14\ng7\n\x85Rp16\n0g2\ng3\ng4\ng5\ng3\ng7\nM\x01\x00\x86Rp7\nM\x08\x00\x86RM\x83\x00\x86RM\xa7\x02\x86R\x85R0g0\ng12\n\x8c\x06__eq__\x86R(I8215359690687096682\nI1862662588367509514\nI8350772864914849965\nI11616510986494699232\nI3711648467207374797\nI9722127090168848805\nI16780197523811627561\nI18138828537077112905\nl\x85R.'
>>>
>>> from pickletools import dis
>>> dis(payload)
0: \x8c SHORT_BINUNICODE 'builtins'
10: \x8c SHORT_BINUNICODE 'getattr'
19: \x93 STACK_GLOBAL
20: \x94 MEMOIZE (as 0)
21: 2 DUP
22: \x8c SHORT_BINUNICODE 'builtins'
32: \x8c SHORT_BINUNICODE 'input'
39: \x93 STACK_GLOBAL
40: \x8c SHORT_BINUNICODE 'FLAG> '
48: \x85 TUPLE1
49: R REDUCE
50: \x8c SHORT_BINUNICODE 'encode'
58: \x86 TUPLE2
59: R REDUCE
60: ) EMPTY_TUPLE
61: R REDUCE
62: \x94 MEMOIZE (as 1)
63: 0 POP
64: g GET 0
67: \x8c SHORT_BINUNICODE 'builtins'
77: \x8c SHORT_BINUNICODE 'dict'
83: \x93 STACK_GLOBAL
84: \x8c SHORT_BINUNICODE 'get'
89: \x86 TUPLE2
90: R REDUCE
91: \x8c SHORT_BINUNICODE 'builtins'
101: \x8c SHORT_BINUNICODE 'globals'
110: \x93 STACK_GLOBAL
111: ) EMPTY_TUPLE
112: R REDUCE
113: \x8c SHORT_BINUNICODE 'f'
116: \x86 TUPLE2
117: R REDUCE
118: \x8c SHORT_BINUNICODE 'seek'
124: \x86 TUPLE2
125: R REDUCE
126: \x94 MEMOIZE (as 2)
127: g GET 0
130: \x8c SHORT_BINUNICODE 'builtins'
140: \x8c SHORT_BINUNICODE 'int'
145: \x93 STACK_GLOBAL
146: \x8c SHORT_BINUNICODE '__add__'
155: \x86 TUPLE2
156: R REDUCE
157: \x94 MEMOIZE (as 3)
158: 0 POP
159: g GET 0
162: \x8c SHORT_BINUNICODE 'builtins'
172: \x8c SHORT_BINUNICODE 'int'
177: \x93 STACK_GLOBAL
178: \x8c SHORT_BINUNICODE '__mul__'
187: \x86 TUPLE2
188: R REDUCE
189: \x94 MEMOIZE (as 4)
190: 0 POP
191: g GET 0
194: \x8c SHORT_BINUNICODE 'builtins'
204: \x8c SHORT_BINUNICODE 'int'
209: \x93 STACK_GLOBAL
210: \x8c SHORT_BINUNICODE '__eq__'
218: \x86 TUPLE2
219: R REDUCE
220: \x94 MEMOIZE (as 5)
221: 0 POP
222: g GET 3
225: g GET 5
228: \x8c SHORT_BINUNICODE 'builtins'
238: \x8c SHORT_BINUNICODE 'len'
243: \x93 STACK_GLOBAL
244: g GET 1
247: \x85 TUPLE1
248: R REDUCE
249: M BININT2 64
252: \x86 TUPLE2
253: R REDUCE
254: M BININT2 261
257: \x86 TUPLE2
258: R REDUCE
259: \x85 TUPLE1
260: R REDUCE
261: . STOP
highest protocol among opcodes = 4
There are a lot of opcodes, we can use this resource to see some description on the different opcodes.
First stage
At first, it might be difficult to understand pickle
opcodes, but once we are in, everything will go smoothly.
This snippet will load function getattr
in memo 0:
0: \x8c SHORT_BINUNICODE 'builtins'
10: \x8c SHORT_BINUNICODE 'getattr'
19: \x93 STACK_GLOBAL
20: \x94 MEMOIZE (as 0)
This one loads input
and then calls it with 'FLAG> '
as first argument and uses .encode()
to transform a str
to bytes
. The result is stored in memo 1:
22: \x8c SHORT_BINUNICODE 'builtins'
32: \x8c SHORT_BINUNICODE 'input'
39: \x93 STACK_GLOBAL
40: \x8c SHORT_BINUNICODE 'FLAG> '
48: \x85 TUPLE1
49: R REDUCE
50: \x8c SHORT_BINUNICODE 'encode'
58: \x86 TUPLE2
59: R REDUCE
60: ) EMPTY_TUPLE
61: R REDUCE
62: \x94 MEMOIZE (as 1)
63: 0 POP
Here, the VM takes memo 0 (getattr
) and gets function dict.get
to obtain variable f
from the globals
dictionary. At last, it stores function f.seek
in memo 2:
64: g GET 0
67: \x8c SHORT_BINUNICODE 'builtins'
77: \x8c SHORT_BINUNICODE 'dict'
83: \x93 STACK_GLOBAL
84: \x8c SHORT_BINUNICODE 'get'
89: \x86 TUPLE2
90: R REDUCE
91: \x8c SHORT_BINUNICODE 'builtins'
101: \x8c SHORT_BINUNICODE 'globals'
110: \x93 STACK_GLOBAL
111: ) EMPTY_TUPLE
112: R REDUCE
113: \x8c SHORT_BINUNICODE 'f'
116: \x86 TUPLE2
117: R REDUCE
118: \x8c SHORT_BINUNICODE 'seek'
124: \x86 TUPLE2
125: R REDUCE
126: \x94 MEMOIZE (as 2)
This variable f
is used to read the pickle
payload (f = io.BytesIO(payload)
).
Now, functions int.__add__
, int.__mul__
and int.__eq__
is stored in memo 3, 4, and 5, respectively:
127: g GET 0
130: \x8c SHORT_BINUNICODE 'builtins'
140: \x8c SHORT_BINUNICODE 'int'
145: \x93 STACK_GLOBAL
146: \x8c SHORT_BINUNICODE '__add__'
155: \x86 TUPLE2
156: R REDUCE
157: \x94 MEMOIZE (as 3)
158: 0 POP
162: \x8c SHORT_BINUNICODE 'builtins'
172: \x8c SHORT_BINUNICODE 'int'
177: \x93 STACK_GLOBAL
178: \x8c SHORT_BINUNICODE '__mul__'
187: \x86 TUPLE2
188: R REDUCE
189: \x94 MEMOIZE (as 4)
190: 0 POP
191: g GET 0
194: \x8c SHORT_BINUNICODE 'builtins'
204: \x8c SHORT_BINUNICODE 'int'
209: \x93 STACK_GLOBAL
210: \x8c SHORT_BINUNICODE '__eq__'
218: \x86 TUPLE2
219: R REDUCE
220: \x94 MEMOIZE (as 5)
221: 0 POP
And this is the end of the disassembly:
222: g GET 3
225: g GET 5
228: \x8c SHORT_BINUNICODE 'builtins'
238: \x8c SHORT_BINUNICODE 'len'
243: \x93 STACK_GLOBAL
244: g GET 1
247: \x85 TUPLE1
248: R REDUCE
249: M BININT2 64
252: \x86 TUPLE2
253: R REDUCE
254: M BININT2 261
257: \x86 TUPLE2
258: R REDUCE
259: \x85 TUPLE1
260: R REDUCE
261: . STOP
What it does is take int.__add__
(memo 3), int.__eq__
(memo 5), then load len
and call it on our input data (memo 1). The result of len
is passed as first argument int.__eq__
, with a second argument 64
. The result (0
or 1
) is added to 261
with int.__add__
. The last REDUCE
opcode (line 260) will call f.seek
(it remains on the stack from previous snippets).
So, if the length of our input has length 64
, the result of int.__eq__
will be 1
, and then the result of int.__add__
will be 262
. This value is passed to f.seek
. As a result, the pickle
VM will skip the STOP
opcode and continue (yes, the pickle
payload is larger than 261
bytes).
Second stage
In order to analyze this with pickletools
, I removed the STOP
opcode and called dis
again. Then, we need to modify the offsets to keep everything as it was:
262: 0 POP
263: g GET 0
266: g GET 1
269: \x8c SHORT_BINUNICODE '__getitem__'
282: \x86 TUPLE2
283: R REDUCE
284: \x94 MEMOIZE (as 6)
285: 0 POP
286: M BININT2 0
289: \x94 MEMOIZE (as 7)
290: 0 POP
291: g GET 2
294: g GET 3
297: g GET 0
300: g GET 6
303: g GET 7
306: \x85 TUPLE1
307: R REDUCE
308: \x8c SHORT_BINUNICODE '__le__'
316: \x86 TUPLE2
317: R REDUCE
318: M BININT2 127
321: \x85 TUPLE1
322: R REDUCE
323: M BININT2 330
326: \x86 TUPLE2
327: R REDUCE
328: \x85 TUPLE1
329: R REDUCE
330: . STOP
This snippet takes getattr
, and our input (let’s call it our_input
) and stores our_input.__getitem__
in memo 6. Then, 0
is stored in memo 7.
Then, the program loads f.seek
, int.__add__
, getattr
, our_input.__getitem__
and our_input[0]
. Basically, the function __le__
is called in our_input[0]
with argument 127
(this is the same as our_input[0] < 0x7f
). After that, the program calls int.__add__
with the previous result and 330
. Similarly as in the previous stage, if the condition is true, the program will execute f.seek(331)
and it will skip the STOP
opcode at line 330.
Third stage
If we remove this STOP
opcode, dis
will complain that memo 7 is already defined:
>>> dis(payload.replace(b'.', b'') + b'.')
...
322: M BININT2 330
325: \x86 TUPLE2
326: R REDUCE
327: \x85 TUPLE1
328: R REDUCE
329: 0 POP
330: g GET 2
333: g GET 3
336: g GET 4
339: g GET 5
342: g GET 3
345: g GET 7
348: M BININT2 1
351: \x86 TUPLE2
352: R REDUCE
353: p PUT 7
Traceback (most recent call last):
...
ValueError: memo key 7 already defined
This happens because of the opcode PUT 7
. To overcome this issue, we can use a dummy value and then modify all PUT
and MEMOIZE
opcodes (MEMOIZE
opcodes are always increasing). Moreover, we need to fix the offsets again as if the STOP
opcodes are still there.
There are no more STOP
opcodes. So, after fixing all PUT
opcodes and correcting the pickle
payload, we get this remaining code:
331: 0 POP
332: g GET 2
335: g GET 3
338: g GET 4
341: g GET 5
344: g GET 3
347: g GET 7
350: M BININT2 1
353: \x86 TUPLE2
354: R REDUCE
355: p PUT 7
358: M BININT2 64
361: \x86 TUPLE2
362: R REDUCE
363: M BININT2 85
366: \x86 TUPLE2
367: R REDUCE
368: M BININT2 290
371: \x86 TUPLE2
372: R REDUCE
373: \x85 TUPLE1
374: R REDUCE
375: 0 POP
376: g GET 0
379: g GET 0
382: ] EMPTY_LIST
383: \x94 MEMOIZE (as 8)
384: \x8c SHORT_BINUNICODE 'append'
392: \x86 TUPLE2
393: R REDUCE
394: \x94 MEMOIZE (as 9)
395: 0 POP
396: g GET 8
399: \x8c SHORT_BINUNICODE '__getitem__'
412: \x86 TUPLE2
413: R REDUCE
414: \x94 MEMOIZE (as 10)
415: 0 POP
416: g GET 0
419: \x8c SHORT_BINUNICODE 'builtins'
429: \x8c SHORT_BINUNICODE 'int'
434: \x93 STACK_GLOBAL
435: \x8c SHORT_BINUNICODE 'from_bytes'
447: \x86 TUPLE2
448: R REDUCE
449: \x94 MEMOIZE (as 11)
450: 0 POP
451: M BININT2 0
454: p PUT 7
457: 0 POP
458: g GET 9
461: g GET 11
465: g GET 6
468: \x8c SHORT_BINUNICODE 'builtins'
478: \x8c SHORT_BINUNICODE 'slice'
485: \x93 STACK_GLOBAL
486: g GET 4
489: g GET 7
492: M BININT2 8
495: \x86 TUPLE2
496: R REDUCE
497: g GET 4
500: g GET 3
503: g GET 7
506: M BININT2 1
509: \x86 TUPLE2
510: R REDUCE
511: M BININT2 8
514: \x86 TUPLE2
515: R REDUCE
516: \x86 TUPLE2
517: R REDUCE
518: \x85 TUPLE1
519: R REDUCE
520: \x8c SHORT_BINUNICODE 'little'
528: \x86 TUPLE2
529: R REDUCE
530: \x85 TUPLE1
531: R REDUCE
532: 0 POP
533: g GET 2
536: g GET 3
539: g GET 4
542: g GET 5
545: g GET 3
548: g GET 7
551: M BININT2 1
554: \x86 TUPLE2
555: R REDUCE
556: p PUT 7
559: M BININT2 8
562: \x86 TUPLE2
563: R REDUCE
564: M BININT2 119
567: \x86 TUPLE2
568: R REDUCE
569: M BININT2 457
572: \x86 TUPLE2
573: R REDUCE
574: \x85 TUPLE1
575: R REDUCE
576: 0 POP
577: g GET 0
580: ] EMPTY_LIST
581: \x94 MEMOIZE (as 12)
582: \x8c SHORT_BINUNICODE 'append'
590: \x86 TUPLE2
591: R REDUCE
592: \x94 MEMOIZE (as 13)
593: 0 POP
594: g GET 0
597: g GET 12
601: \x8c SHORT_BINUNICODE '__getitem__'
614: \x86 TUPLE2
615: R REDUCE
616: \x94 MEMOIZE (as 14)
617: 0 POP
618: g GET 0
621: \x8c SHORT_BINUNICODE 'builtins'
631: \x8c SHORT_BINUNICODE 'int'
636: \x93 STACK_GLOBAL
637: \x8c SHORT_BINUNICODE '__xor__'
646: \x86 TUPLE2
647: R REDUCE
648: \x94 MEMOIZE (as 15)
649: 0 POP
650: I INT 1244422970072434993
671: \x94 MEMOIZE (as 16)
672: 0 POP
673: M BININT2 0
676: p PUT 7
679: 0 POP
680: g GET 13
684: \x8c SHORT_BINUNICODE 'builtins'
694: \x8c SHORT_BINUNICODE 'pow'
699: \x93 STACK_GLOBAL
700: g GET 15
704: g GET 10
708: g GET 7
711: \x85 TUPLE1
712: R REDUCE
713: g GET 16
717: \x86 TUPLE2
718: R REDUCE
719: I INT 65537
726: I INT 18446744073709551557
748: \x87 TUPLE3
749: R REDUCE
750: \x85 TUPLE1
751: R REDUCE
752: 0 POP
753: g GET 14
757: g GET 7
760: \x85 TUPLE1
761: R REDUCE
762: p PUT 16
766: 0 POP
767: g GET 2
770: g GET 3
773: g GET 4
776: g GET 5
779: g GET 3
782: g GET 7
785: M BININT2 1
788: \x86 TUPLE2
789: R REDUCE
790: p PUT 7
793: M BININT2 8
796: \x86 TUPLE2
797: R REDUCE
798: M BININT2 131
801: \x86 TUPLE2
802: R REDUCE
803: M BININT2 679
806: \x86 TUPLE2
807: R REDUCE
808: \x85 TUPLE1
809: R REDUCE
810: 0 POP
811: g GET 0
814: g GET 12
818: \x8c SHORT_BINUNICODE '__eq__'
826: \x86 TUPLE2
827: R REDUCE
828: ( MARK
829: I INT 8215359690687096682
850: I INT 1862662588367509514
871: I INT 8350772864914849965
892: I INT 11616510986494699232
916: I INT 3711648467207374797
935: I INT 9722127090168848805
956: I INT 16780197523811627561
978: I INT 18138828537077112905
1000: l LIST (MARK at 831)
1001: \x85 TUPLE1
1002: R REDUCE
1003: . STOP
Let’s analyze this snippet:
331: 0 POP
332: g GET 2
335: g GET 3
338: g GET 4
341: g GET 5
344: g GET 3
347: g GET 7
350: M BININT2 1
353: \x86 TUPLE2
354: R REDUCE
355: p PUT 7
358: M BININT2 64
361: \x86 TUPLE2
362: R REDUCE
363: M BININT2 85
366: \x86 TUPLE2
367: R REDUCE
368: M BININT2 290
371: \x86 TUPLE2
372: R REDUCE
373: \x85 TUPLE1
374: R REDUCE
It loads these objects on the stack:
f.seek
int.__add__
int.__mul__
int.__eq__
int.__add__
0
First, it executes int.__add__(0, 1)
and stores the result on memo 7. If this result equals 64
, then the program will multiply the result (0
or 1
) by 85
and then add 290
. Finally, f.seek
is called. As a result, the program is iterating through all 64
indices and jumping to line 290 (remember that it checks if the character is < 0x7f
). When the last index is reached, the program jumps to line 375 (85 + 290
):
375: 0 POP
376: g GET 0
379: g GET 0
382: ] EMPTY_LIST
383: \x94 MEMOIZE (as 8)
384: \x8c SHORT_BINUNICODE 'append'
392: \x86 TUPLE2
393: R REDUCE
394: \x94 MEMOIZE (as 9)
395: 0 POP
396: g GET 8
399: \x8c SHORT_BINUNICODE '__getitem__'
412: \x86 TUPLE2
413: R REDUCE
414: \x94 MEMOIZE (as 10)
415: 0 POP
416: g GET 0
419: \x8c SHORT_BINUNICODE 'builtins'
429: \x8c SHORT_BINUNICODE 'int'
434: \x93 STACK_GLOBAL
435: \x8c SHORT_BINUNICODE 'from_bytes'
447: \x86 TUPLE2
448: R REDUCE
449: \x94 MEMOIZE (as 11)
450: 0 POP
Here, the program creates an empty list (let’s call it blocks
), stored at memo 8, and then stores blocks.append
in memo 9. It also stores blocks.__getitem__
in memo 10. Finally, it stores int.from_bytes
in memo 11.
Let’s take a look at this piece of code:
451: M BININT2 0
454: p PUT 7
457: 0 POP
458: g GET 9
461: g GET 11
465: g GET 6
468: \x8c SHORT_BINUNICODE 'builtins'
478: \x8c SHORT_BINUNICODE 'slice'
485: \x93 STACK_GLOBAL
486: g GET 4
489: g GET 7
492: M BININT2 8
495: \x86 TUPLE2
496: R REDUCE
This is the start of another loop. It stores 0
in memo 7 (it will be used as an index). Then, it loads these objects in the stack:
blocks.append
int.from_bytes
our_input.__getitem__
slice
int.__mul__
0
8
The first function to call is int.__mul__(0, 8)
. Then, we have:
497: g GET 4
500: g GET 3
503: g GET 7
506: M BININT2 1
509: \x86 TUPLE2
510: R REDUCE
511: M BININT2 8
514: \x86 TUPLE2
515: R REDUCE
516: \x86 TUPLE2
517: R REDUCE
518: \x85 TUPLE1
519: R REDUCE
520: \x8c SHORT_BINUNICODE 'little'
528: \x86 TUPLE2
529: R REDUCE
530: \x85 TUPLE1
531: R REDUCE
532: 0 POP
So, it loads:
int.__mul__
int.__add__
0
1
And the result is that int.__add__(0, 1)
(gives 1
) is multiplied as int.__mul__(1, 8)
. In the end, we have 0
and 8
on the stack. The next function to be called is slice
, and then our_input.__getitem__
. This procedure is the same as running our_input[0:8]
. After that, int.from_bytes
is executed indicating 'little'
as second argument. Finally, the result is stored in blocks
using blocks.append
.
And this is the code that handles the loop:
533: g GET 2
536: g GET 3
539: g GET 4
542: g GET 5
545: g GET 3
548: g GET 7
551: M BININT2 1
554: \x86 TUPLE2
555: R REDUCE
556: p PUT 7
559: M BININT2 8
562: \x86 TUPLE2
563: R REDUCE
564: M BININT2 119
567: \x86 TUPLE2
568: R REDUCE
569: M BININT2 457
572: \x86 TUPLE2 bytes en total)
573: R REDUCE
574: \x85 TUPLE1
575: R REDUCE
576: 0 POP
Basically, it will increase the index in 8
and stored it in memo 7. Then it will jump to line 457 until all 8-byte blocks are transformed to int
and stored in blocks
. Once that is done, the VM jumps to line 576 (119 + 457
):
576: 0 POP
577: g GET 0
580: ] EMPTY_LIST
581: \x94 MEMOIZE (as 12)
582: \x8c SHORT_BINUNICODE 'append'
590: \x86 TUPLE2
591: R REDUCE
592: \x94 MEMOIZE (as 13)
593: 0 POP
594: g GET 0
597: g GET 12
601: \x8c SHORT_BINUNICODE '__getitem__'
614: \x86 TUPLE2
615: R REDUCE
616: \x94 MEMOIZE (as 14)
617: 0 POP
618: g GET 0
621: \x8c SHORT_BINUNICODE 'builtins'
631: \x8c SHORT_BINUNICODE 'int'
636: \x93 STACK_GLOBAL
637: \x8c SHORT_BINUNICODE '__xor__'
646: \x86 TUPLE2
647: R REDUCE
648: \x94 MEMOIZE (as 15)
649: 0 POP
650: I INT 1244422970072434993
671: \x94 MEMOIZE (as 16)
672: 0 POP
In these lines, a new list is created (let’s call it ciphertexts
) and stored in memo 12. Then, ciphertexts.append
is stored in memo 13 and ciphertexts.__getitem__
in memo 14. After that, int.__xor__
will be at memo 15 and a number 1244422970072434993
in memo 16.
Here we will start another loop:
673: M BININT2 0
676: p PUT 7
679: 0 POP
680: g GET 13
684: \x8c SHORT_BINUNICODE 'builtins'
694: \x8c SHORT_BINUNICODE 'pow'
699: \x93 STACK_GLOBAL
700: g GET 15
704: g GET 10
708: g GET 7
711: \x85 TUPLE1
712: R REDUCE
713: g GET 16
717: \x86 TUPLE2
718: R REDUCE
719: I INT 65537
726: I INT 18446744073709551557
748: \x87 TUPLE3
749: R REDUCE
750: \x85 TUPLE1
751: R REDUCE
752: 0 POP
753: g GET 14
757: g GET 7
760: \x85 TUPLE1
761: R REDUCE
762: p PUT 16
766: 0 POP
It will store 0
in memo 7 (the index variable) and it will compute pow(blocks[0], 65537, 18446744073709551557)
. The result of these computation is passed to int.__xor__
with the value at memo 16 (1244422970072434993
). The result of the pow
function is saved in memo 16 (overwriting the previous value).
Notice that we have some RSA encryption here…
Then, we have another set of conditions to continue with the loop:
767: g GET 2
770: g GET 3
773: g GET 4
776: g GET 5
779: g GET 3
782: g GET 7
785: M BININT2 1
788: \x86 TUPLE2
789: R REDUCE
790: p PUT 7
793: M BININT2 8
796: \x86 TUPLE2
797: R REDUCE
798: M BININT2 131
801: \x86 TUPLE2
802: R REDUCE
803: M BININT2 679
806: \x86 TUPLE2
807: R REDUCE
808: \x85 TUPLE1
809: R REDUCE
Basically, it will jump to line 679 until all blocks are encrypted. After that, the program will jump to line 810 (131 + 679
):
810: 0 POP
811: g GET 0
814: g GET 12
818: \x8c SHORT_BINUNICODE '__eq__'
826: \x86 TUPLE2
827: R REDUCE
828: ( MARK
829: I INT 8215359690687096682
850: I INT 1862662588367509514
871: I INT 8350772864914849965
892: I INT 11616510986494699232
916: I INT 3711648467207374797
935: I INT 9722127090168848805
956: I INT 16780197523811627561
978: I INT 18138828537077112905
1000: l LIST (MARK at 831)
1001: \x85 TUPLE1
1002: R REDUCE
1003: . STOP
And we have reached the end of the pickle
payload. Now, the program just checks that all encrypted blocks (which are computed on our_input
) match the ones that appear above.
Solution
Warning! Crypto stuff!
What we need to do now is to reverse the encryption from the expected outputs. We know that it is using RSA, where $e = 65537$ and $n = 18446744073709551557$. Actually, $n$ is a prime number, so it is not a safe RSA setup.
To decrypt RSA we will need $d = e^{-1} \mod{\phi(n)}$. Since $n$ is prime, then $\phi(n) = n - 1$, so we have everything in order to compute $d$. After that, we can decrypt messages as $m = c^d \mod{n}$.
Also, take into account that there is also a XOR operation applied on each block, and the value changes on each blocks (like in CBC mode).
In the end, we can build this reverse algorithm:
#!/usr/bin/env python3
x = 1244422970072434993
e = 65537
n = 18446744073709551557
d = pow(e, -1, n - 1)
ct = [
8215359690687096682,
1862662588367509514,
8350772864914849965,
11616510986494699232,
3711648467207374797,
9722127090168848805,
16780197523811627561,
18138828537077112905
]
flag = b''
for c in ct:
flag += (pow(c, d, n) ^ x).to_bytes(8, 'little')
x = c
print(flag.decode())
If we run this script, we will capture the flag:
$ python3 solve.py
SECCON{Can_someone_please_make_a_debugger_for_Pickle_bytecode??}
$ python3 problem.py
FLAG> SECCON{Can_someone_please_make_a_debugger_for_Pickle_bytecode??}
Congratulations!!