AmateursCTF 2023: Reverse Engineering Challenges

July 21, 2023

Most of the challenges were fairly easy, but a few were worth something, so here are my notes.

volcano #

Inspired by recent "traumatic" events.
nc amt.rs 31010

To pass, the inputs b, v, and p must satisfy:

b % 22890 == 18476
17 <= binary_ones(v) <= 26
p % 2 == 1
digit_len(b) == digit_len(v)
digit_sum(b) == digit_sum(v)
4919^b % p == 4919^v % p

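Build a z3 constraint system and brute-force condition 2; for condition 4, fix the decimal length at 10.

This template can also be reused to rule out z3's multiple solutions.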

def fn(m):
    n=m[vv].as_long()
    cnt=0
    while n:
        n&=(n-1)
        cnt+=1
    print(f"Tried {m[vv].as_long()=}, {cnt=}")
    return cnt>16 and cnt<=26

def z3gen(conds,fn):
    s=z3.Solver()
    s.add(conds)
    result=s.check()
    while result==z3.sat:
        m=s.model()
        if fn(m) is True:
            yield m
        # exclude the previous model by AND-ing its negation:
        # NOT(a == x AND b == y ...) is implemented as OR(a != x, b != y, ...)
        extra=[var()!=m[var] for var in m]
        s.add(z3.Or(extra))
        result=s.check()

This yields one solution (b, v) = (1080403586, 9422911007).

For condition 6, simply take p = 4919: both sides are then 0 modulo p.
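As a sanity check, the six conditions can be re-evaluated in plain Python. This is only a sketch reconstructed from the list above; the helper names `digit_sum` and `check` are hypothetical, and the real binary may compute these properties differently.

```python
def digit_sum(n: int) -> int:
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))


def check(b: int, v: int, p: int) -> bool:
    """Re-evaluate the six listed conditions (assumed reconstruction)."""
    return (
        b % 22890 == 18476                      # condition 1
        and 17 <= bin(v).count("1") <= 26       # condition 2: number of 1 bits in v
        and p % 2 == 1                          # condition 3
        and len(str(b)) == len(str(v))          # condition 4: same decimal length
        and digit_sum(b) == digit_sum(v)        # condition 5
        and pow(4919, b, p) == pow(4919, v, p)  # condition 6: modular exponentiation
    )


print(check(1080403586, 9422911007, 4919))
```

With p = 4919 both modular powers are 0, so condition 6 holds for any positive b and v.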

amateursCTF{yep_th0se_l00k_th3_s4me_to_m3!_:clueless:}

ds_alg #

I was doing some homework for my Data Structures and Algorithms class,
but my program unexpectedly crashed when I entered in my flag. Could you help me get it back?
Here's the coredump and the binary, I'll even toss in the header file.
Can't give out the source code though, how do I know you won't cheat off me?

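The implementation bug is that the ptr field of each linked-list node gets XORed.

The flag contents should therefore still be sitting on the heap.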

struct listnode {
    byte data;
    struct listnode *ptr;
};

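There should be 7 bytes of padding between data and ptr.

So simply search the coredump for 61 00 00 00 00 00 00 00 (0x61 is 'a', the first character of the flag).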
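The same search can be generalized to every printable character. The sketch below assumes the padding bytes are zero and that nodes are 16 bytes (1 byte of data, 7 of padding, 8 of pointer); `find_candidate_nodes` is a hypothetical helper, not part of the challenge files.

```python
PAD = b"\x00" * 7


def find_candidate_nodes(dump: bytes):
    """Return (offset, char, raw_ptr) for every spot that looks like a listnode."""
    hits = []
    for off in range(len(dump) - 15):
        ch = dump[off]
        # one printable data byte followed by 7 zero padding bytes
        if 0x20 <= ch < 0x7F and dump[off + 1 : off + 8] == PAD:
            raw_ptr = int.from_bytes(dump[off + 8 : off + 16], "little")
            hits.append((off, chr(ch), raw_ptr))
    return hits


# Tiny synthetic "dump" containing a single node holding 'a'
demo = b"\xff" * 5 + b"a" + PAD + (0x55AA).to_bytes(8, "little") + b"\xff" * 3
print(find_candidate_nodes(demo))
```

Because the ptr fields are XORed, the hits only give candidate characters; their order still has to be recovered from the heap layout.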

amateursCTF{l1nk3d_bY_4n_xor}

rusteze #

Get rid of all your Rust rust with this brand new Rust-eze™ de-ruster.
Flag is amateursCTF{[a-zA-Z0-9_]+}

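This one ships with symbols, which makes it fairly friendly.

Without symbols, pay attention to where the Rust main lives: the entry stub calls std::rt::lang_start(), and the real main is its first argument.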

amateursCTF{h0pe_y0u_w3r3nt_t00_ru5ty}

rusteze2 #

My boss said Linux binaries wouldn't reach enough customers so I was forced to make a Windows version.
Flag is amateursCTF{[a-zA-Z0-9_]+}

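A Windows binary without symbols, but the structure is much the same as rusteze.

Define a custom signature for the function that returns a Rust string: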

rust_str __userpurge sub_140004E70@<rdx:rax>(rust_str *x@<rcx>)

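Solving it the same way as rusteze yields sorry_this_isnt_the_flag_this_time.

However, s2 on line 76 (of the decompiled output) looks suspicious: its computed value stays resident in memory.

Under a debugger it turns out to be the flag.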

amateursCTF{d0n3_4nd_deRust3d}

trick question #

Which one do you hate more: decompiling pycs or reading Python bytecode disassembly? Just kidding that's a trick question.
Run with Python version 3.10.
Flag is amateursCTF{[a-zA-Z0-9_]+}

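First decompile with pycdc; the result turns out to be a base64-encoded CodeObject.

Dump it with marshal and run it through pycdc once more.

The FLAG has the shape 0000000_111_2_333333_44_55555555555_666666, where each digit marks the part a character belongs to:

Part 0: sn0h7YP, reversed
Part 1: x+y-z==160, y+z-x==68, z+x-y==34; solve with z3
Part 2: the hash of a single character is known; brute-force it
Part 3: the single character of part 2 is used as the seed before calling random.shuffle; build an index list, shuffle it the same way, then invert the process by index
Part 4: given directly
Part 5: same seed, then XOR with randint
Part 6: parse 0x29ee69af2f3 as a base-128 number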
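Several of these steps can be sketched with the standard library alone. The block below is an illustration under assumptions: part 1 is solved with plain arithmetic instead of z3, and the seed type passed to `unshuffle` is hypothetical (the original passes the part 2 character in some form). All function names are made up for this sketch.

```python
import random


def solve_part1() -> str:
    # x+y-z == 160, y+z-x == 68, z+x-y == 34; adding equations pairwise
    # isolates each unknown: 2y = 160+68, 2x = 160+34, 2z = 68+34
    a, b, c = 160, 68, 34
    x, y, z = (a + c) // 2, (a + b) // 2, (b + c) // 2
    return bytes([x, y, z]).decode()


def unshuffle(shuffled, seed):
    """Invert random.shuffle by replaying it on a list of indices."""
    idx = list(range(len(shuffled)))
    random.seed(seed)
    random.shuffle(idx)  # idx[pos] is the original index of the item now at pos
    original = [None] * len(shuffled)
    for pos, src in enumerate(idx):
        original[src] = shuffled[pos]
    return original


def decode_base128(n: int) -> str:
    """Read n as a big-endian base-128 number, one character per digit."""
    out = []
    while n:
        n, digit = divmod(n, 128)
        out.append(chr(digit))
    return "".join(reversed(out))


print("sn0h7YP"[::-1])                # part 0
print(solve_part1())                  # part 1
print(decode_base128(0x29EE69AF2F3))  # part 6
```

The index-list trick works because random.shuffle applies the same permutation to any list of the same length once the generator has been seeded identically.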

amateursCTF{PY7h0ns_ar3_4_f4m1lY_0f_N0Nv3nom0us_Sn4kes}

headache #

Ugh... my head hurts...
Flag is amateursCTF{[a-zA-Z0-9_]+}

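This is self-modifying code (SMC) encrypted with XOR; the hard part is that the decryption is nested about 100 layers deep.

Moreover, the next layer is only decrypted if the flag passes the current layer's check; otherwise the function returns 0.

This is a trick to defeat memory dumping.

The unpack loop terminates when it hits four \x00 bytes, so it suffices to walk backwards from the end of the function to its start, looking for four consecutive \x00 bytes.

Once the unpack code is found, the operands of its mov and call instructions also need to be extracted.

The whole unpack block is then replaced by a single jmp and padded with nop. Note that the four \x00 sentinel bytes decrypt to garbage, so they must be nop'ed out as well.

Pattern of the unpack code (eight instructions in total):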

mov eax, :key
lea rsi, :start
xor [rsi], eax
add rsi, 4
cmp [rsi - 4], eax
jnz :label
call :start
ret
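In Python terms, one unpack layer behaves roughly like the sketch below. This is my reading of the eight-instruction pattern, not code from the binary; `unpack_layer` and the demo bytes are hypothetical.

```python
import struct


def unpack_layer(buf: bytearray, start: int, key: int) -> int:
    """XOR little-endian dwords from `start` with `key` until the zero sentinel.

    Returns the offset just past the sentinel, where the unpack stub begins.
    """
    off = start
    while off + 4 <= len(buf):
        (enc,) = struct.unpack_from("<I", buf, off)
        dec = enc ^ key                        # xor [rsi], eax
        struct.pack_into("<I", buf, off, dec)
        off += 4                               # add rsi, 4
        if dec == key:                         # cmp [rsi - 4], eax
            return off                         # encrypted dword was 0: sentinel
    raise ValueError("no zero sentinel found")


key = 0xDEADBEEF
plain = b"\x48\x31\xc0\xc3" * 2  # two dwords of pretend code
layer = bytearray(p ^ k for p, k in zip(plain, struct.pack("<I", key) * 2))
layer += bytearray(4)            # the four \x00 sentinel bytes
end = unpack_layer(layer, 0, key)
print(bytes(layer[:8]) == plain, end)
```

After the loop the sentinel holds the key bytes rather than zeros, which is why it has to be overwritten with nop when patching.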

Below is my script using capstone + keystone; another approach is to use idaapi.

from ida_bytes import get_bytes, patch_bytes
from capstone import *
from keystone import *

BEGIN = 0x4012A4
END = 0x404394

text = bytearray(get_bytes(BEGIN, END - BEGIN))
md = Cs(CS_ARCH_X86, CS_MODE_64)
ks = Ks(KS_ARCH_X86, KS_MODE_64)

print("\n" + "=" * 80 + "\n")


def p32(x):
    return bytes([x & 0xFF, (x >> 8) & 0xFF, (x >> 16) & 0xFF, (x >> 24) & 0xFF])


cnt0 = 0
for ra in range(len(text) - 1, -1, -1):
    if text[ra] == 0:
        cnt0 += 1
    else:
        cnt0 = 0

    if cnt0 == 4:
        ra += cnt0
        va = ra + BEGIN
        insns = list(md.disasm(text[ra : ra + 40], offset=ra + BEGIN, count=8))

        # get xor key
        _mov = insns[0]
        assert _mov.mnemonic == "mov"
        key = _mov.op_str.split(", ")[1]
        key = int(key, 16)
        key_bytes = p32(key)

        # get start address of the encoded buffer
        _call = insns[-2]
        assert _call.mnemonic == "call"
        start = _call.op_str
        start = int(start, 16)
        start_ra = start - BEGIN

        print(f"{hex(va)}:\tkey: {hex(key)}\tnext layer: {hex(start)}")

        # unpack
        assert (va - start) % 4 == 0
        for i in range(start_ra, ra, 4):
            for j in range(4):
                text[i + j] ^= key_bytes[j]

        # patch
        size = sum(insn.size for insn in insns)
        # 4 zero sentinels also need to be patched
        text[ra - 4 : ra + size] = [0x90] * (size + 4)
        _jmp = ks.asm(f"jmp\t{hex(start)}", addr=_call.address)[0]
        _jmp_ra = _call.address - BEGIN
        text[_jmp_ra : _jmp_ra + len(_jmp)] = _jmp

        cnt0 = 0

patch_bytes(BEGIN, bytes(text))

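Once the script has run successfully, select all of sub_401290, press c to analyze it as code (overwriting the previous analysis), then press p to turn it into a function.

There are several hundred such constraints. Looking at the check statements, the pattern of each constraint is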

mov r15b, [rdi + :i]
xor r15b, [rdi + :j]
cmp r15b, :w

i.e. flag[i] ^ flag[j] == w

Since the flag prefix amateursCTF is known, I extracted all the constraints with idaapi and handed them to z3.

import ida_bytes
import idaapi
import z3
from Crypto.Util.number import long_to_bytes as l2b

BEGIN = 0x4012A4
END = 0x404394

print("\n" + "=" * 80 + "\n")

S = z3.Solver()
flags = [z3.BitVec(f"s{i}", 8) for i in range(61)]
flag = z3.Concat(*flags)

prefix = b"amateurs"
for i in range(len(prefix)):
    S.add(flags[i] == prefix[i])

ea = BEGIN
insn = idaapi.insn_t()
insn1 = idaapi.insn_t()
insn2 = idaapi.insn_t()

while ea < END:
    idaapi.decode_insn(insn, ea)

    if insn.get_canon_mnem() == "mov":
        idaapi.decode_insn(insn1, insn.ea + insn.size)
        idaapi.decode_insn(insn2, insn1.ea + insn1.size)
        if insn1.get_canon_mnem() == "xor" and insn2.get_canon_mnem() == "cmp":
            i = insn.Op2.addr  # mov r15b, [rdi + ${i}]
            j = insn1.Op2.addr  # xor r15b, [rdi + ${j}]
            w = insn2.Op2.value  # cmp r15b, ${w}
            print(f"{ea}:\ts[{i}]^s[{j}]=={w}")
            S.add(flags[i] ^ flags[j] == w)
            ea = insn2.ea + insn2.size
            continue

    ea += insn.size

assert S.check() == z3.sat  # fail loudly if the constraints are inconsistent
m = S.model()
solved = l2b(m.eval(flag).as_long())
print(solved)
#amateursCTF{i_h4v3_a_spli77ing_headache_1_r3qu1re_m04r_sl33p}

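Another approach is to build a weighted graph whose edge weights are the XOR of the two endpoint nodes, then run a breadth-first search from the known nodes.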
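A minimal sketch of that graph approach, assuming the constraints have already been extracted as (i, j, w) triples; `solve_xor_graph` and the toy data are hypothetical and not taken from the challenge.

```python
from collections import defaultdict, deque


def solve_xor_graph(constraints, known):
    """Propagate known bytes along edges where value[i] ^ value[j] == w."""
    graph = defaultdict(list)
    for i, j, w in constraints:
        graph[i].append((j, w))
        graph[j].append((i, w))

    value = dict(known)
    queue = deque(known)
    while queue:
        i = queue.popleft()
        for j, w in graph[i]:
            candidate = value[i] ^ w
            if j not in value:
                value[j] = candidate
                queue.append(j)
            elif value[j] != candidate:
                raise ValueError(f"inconsistent constraint between {i} and {j}")
    return value


# Toy example: chain constraints derived from a made-up string
secret = b"amateursCTF{demo}"
toy = [(i, i + 1, secret[i] ^ secret[i + 1]) for i in range(len(secret) - 1)]
solved = solve_xor_graph(toy, {0: ord("a")})
print(bytes(solved[i] for i in range(len(secret))))
```

A single known byte is enough to recover every index in the same connected component, so the known prefix is more than sufficient when the constraint graph is connected.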

jvm #

I heard my professor talking about some "Java Virtual Machine" and its weird gimmicks, so I took it upon myself to complete one. It wasn't even that hard? I don't know why he was complaining about it so much.
Compiled with openjdk 11.0.19.
Run with java JVM code.jvm.

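Decompile JVM.class with JEB.

It is a very simple VM with 4 registers, each 8 bits wide.

If an opcode is not recognized, the VM self-decodes the three bytes starting at the current position and then tries to match the opcode again.

The bytecode rules are too simple to be worth listing here.

My approach was to translate the bytecode to x64 and then solve the constraints with angr.

rsi is chosen as the input pointer, rdi as the output pointer, and r12b ~ r15b as the 4 registers.

Reading input onto the stack is implemented as lodsb, push rax.

Popping from the stack and printing is implemented as pop rax, stosb.

Self-decoding is implemented as prog[i], prog[i+1], prog[i+2] = prog[i]^prog[i+1]^prog[i+2], prog[i+1], prog[i]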
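The self-decode step on its own can be sketched as follows; `self_decode` is a hypothetical helper name and the three demo bytes are arbitrary.

```python
def self_decode(prog: bytearray, i: int) -> None:
    """Rewrite the three bytes at i in place, as in the tuple assignment above."""
    a, b, c = prog[i], prog[i + 1], prog[i + 2]
    prog[i], prog[i + 1], prog[i + 2] = a ^ b ^ c, b, a


demo = bytearray([1, 2, 3])
self_decode(demo, 0)
print(list(demo))
```

One property worth knowing when debugging: applying the step three times restores the original bytes, so a position that never decodes to a valid opcode would cycle rather than settle.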

The translation template is as follows (including relocation of jmp and jcc):

from keystone import *
from capstone import *
from pwn import p32


md = Cs(CS_ARCH_X86, CS_MODE_64)
ks = Ks(KS_ARCH_X86, KS_MODE_64)

with open("code.jvm", "rb") as fp:
    prog = bytearray(fp.read())

text = []
addr_map = {}
relocs = {}
r = {0: "r12b", 1: "r13b", 2: "r14b", 3: "r15b"}

# reimplement a JIT compiler
# void __usercall func(output@<rdi>, input@<rsi>)
ip = 0
va = len(text)
while ip < len(prog):
    insn, o1, o2 = (
        prog[ip],
        prog[ip + 1] if ip + 1 < len(prog) else None,
        prog[ip + 2] if ip + 2 < len(prog) else None,
    )
    addr_map[ip] = va

    def emit(s):
        global va, text
        text.extend(ks.asm(s, addr=va)[0])
        va = len(text)

    if insn in [0, 1, 2, 3]:
        emit(f"xchg {r[insn]}, {r[o1]}")
        ip += 2
    elif insn & 0xFE == 8:
        emit(f"add {r[o1]}, {r[o2] if insn & 1 else o2}")
        ip += 3
    elif insn & 0xFE == 12:
        emit(f"sub {r[o1]}, {r[o2] if insn & 1 else o2}")
        ip += 3
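    # NOTE: two-operand mul/div and "mod" are not valid x86 instructions, so
    # keystone would reject them; presumably code.jvm never reaches these opcodes.
    elif insn & 0xFE == 16: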
        emit(f"mul {r[o1]}, {r[o2] if insn & 1 else o2}")
        ip += 3
    elif insn & 0xFE == 20:
        emit(f"div {r[o1]}, {r[o2] if insn & 1 else o2}")
        ip += 3
    elif insn & 0xFE == 24:
        emit(f"mod {r[o1]}, {r[o2] if insn & 1 else o2}")
        ip += 3
    elif insn & 0xFE == 28:
        emit(f"shl {r[o1]}, {r[o2] if insn & 1 else o2}")
        ip += 3
    elif insn == 31:
        emit(f"lodsb")  # load one input byte into a register
        emit(f"mov {r[o1]}, al")
        ip += 2
    elif insn == 32:  # push one input byte onto the stack
        emit(f"lodsb")
        emit(f"push rax")
        ip += 1
    elif insn == 33:  # print the register contents
        emit(f"mov al, {r[o1]}")
        emit(f"stosb")
        ip += 2
    elif insn == 34:  # pop from the stack and print
        emit(f"pop rax")
        emit(f"stosb")
        ip += 1
    elif insn == 41:
        emit(f"test {r[o1]}, {r[o1]}")
        relocs[va + 2] = o2
        emit(f"je 0x7fffffff")
        ip += 3
    elif insn == 42:
        emit(f"test {r[o1]}, {r[o1]}")
        relocs[va + 2] = o2
        emit(f"jne 0x7fffffff")
        ip += 3
    elif insn == 43:
        relocs[va + 1] = o1
        emit(f"jmp 0x7fffffff")
        ip += 2
    elif insn == 52:
        emit(f"mov al, {r[o1]}")
        emit(f"push rax")
        ip += 2
    elif insn == 53:
        emit(f"pop rax")
        emit(f"mov {r[o1]}, al")
        ip += 2
    elif insn == 54:
        emit(f"push {o1}")
        ip += 2
    elif insn == 127:
        emit(f"ret")
        ip += 1
    else:  # JIT SMC
        prog[ip] = insn ^ o1 ^ o2
        prog[ip + 2] = insn
        print(prog[ip], prog[ip + 1], prog[ip + 2])


for off in relocs:
    dest = addr_map[relocs[off]]
    rel = dest - (off + 4)
    rel = rel & 0xFFFF_FFFF
    text[off : off + 4] = p32(rel)


text = bytes(text)

debug = True

if debug:
    fp = open("asm.txt", "w")
    for insn in md.disasm(text, offset=0x1000):
        mnemonic = insn.mnemonic
        op_str = insn.op_str
        addr = insn.address
        disasm = f"{hex(addr)}:\t{mnemonic}\t{op_str}"
        fp.write(disasm + "\n")
    fp.close()

with open("code.bin", "wb") as fp:
    fp.write(text)

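Then let angr do the work: look for the YES branch (0x2a) and avoid the NO branch (0x37).

In the implementation, take care to set initial values for rdi, rsi, r12b ~ r15b, rsp, and so on.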

import angr, claripy


base_addr = 0x1000
entry_point = 0x0 + base_addr
input_addr = 0x1000 + base_addr
output_addr = 0x2000 + base_addr
stack_addr = 0x3000 + base_addr

proj = angr.Project(
    "code.bin",
    main_opts={
        "base_addr": base_addr,
        "entry_point": entry_point,
        "backend": "blob",
        "arch": "amd64",
    },
)

begin = proj.factory.blank_state(addr=entry_point)
begin.regs.rsi = input_addr
begin.regs.rdi = output_addr
begin.regs.rsp = stack_addr - 0x100
begin.regs.r12 = 0
begin.regs.r13 = 0
begin.regs.r14 = 0
begin.regs.r15 = 0


def step_func(lsm):
    lsm.stash(
        filter_func=lambda state: state.addr == 0x37 + base_addr,
        from_stash="active",
        to_stash="avoid",
    )
    lsm.stash(
        filter_func=lambda state: state.addr == 0x2a + base_addr,
        from_stash="active",
        to_stash="found",
    )
    return lsm


simgr = proj.factory.simgr(begin)
simgr.explore(step_func=step_func, until=lambda lsm: len(lsm.found) > 0)
end = simgr.found[0]
flag = end.memory.load(input_addr, 256).chop(8)

solved = ""
for b in flag:
    c = end.solver.eval(b)
    if c == 0:
        break
    solved += chr(c)
print(solved)
#amateursCTF{wh4t_d0_yoU_m34n_j4v4_isnt_A_vm?}

The unconstrained input is generated automatically by angr.

Since the length is unknown, printing the result is slightly more cumbersome.

emoji #

I apologize in advance.
Flag is amateursCTF{[a-z0-9_]+}
Compiled using the latest version of emojicode
Note that there may be multiple inputs that cause the program to print ✅.
The correct input (and the flag) has sha256 hash
53cf379fa8fd802fd2f99b2aa395fe8b19b066ab5e2ff49e44633ce046c346c4.

emojicode reference

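An example to explain the algorithm:

If the input is abcd

abcd -> 0x61626364 -> 1100001 1100010 1100011 1100100

The bits are laid out in a 16x16 matrix, padded with 0 where there are not enough, and then the lengths of all runs of consecutive 1s in every row and column are recorded, e.g. 1101101110 -> [2, 2, 3]

The input must be constructed so that the resulting arrays for every row and column match the given values exactly.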
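The encoding described above can be sketched in a few lines. This follows the abcd example (each character contributes its binary form without leading zeros) and assumes the padding zeros go at the end; the function names are hypothetical.

```python
from itertools import groupby


def to_bits(s: str) -> str:
    """Concatenate the binary form of each character, without leading zeros."""
    return "".join(bin(ord(ch))[2:] for ch in s)


def run_lengths(line: str) -> list:
    """Lengths of the maximal runs of '1' in a string of bits."""
    return [len(list(group)) for bit, group in groupby(line) if bit == "1"]


def clues(s: str, size: int = 16):
    """Row and column run-length clues for the size x size bit matrix."""
    bits = to_bits(s).ljust(size * size, "0")  # assumed: zero padding at the end
    rows = [bits[r * size : (r + 1) * size] for r in range(size)]
    cols = ["".join(row[c] for row in rows) for c in range(size)]
    return [run_lengths(r) for r in rows], [run_lengths(c) for c in cols]


print(run_lengths("1101101110"))
print(to_bits("abcd"))
row_clues, col_clues = clues("abcd")
print(row_clues[0])
```

Seen this way, the checker is a nonogram: the program stores the row and column clues, and the flag is the picture.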

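The flag always starts with amateursCTF{, so it is just a matter of playing the puzzle.

Once the 16x16 bitstream is recovered, it is unknown whether each character takes 6 or 7 bits, so a depth-first search is needed.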

from hashlib import sha256

enc = "".join(
    [
        "110111110100",
        "0110100111100111",
        "0111111100010110",
        "0101110110011011",
        "1111011101100111",
        "1110011101111111",
        "0100111011101110",
        "0111100011100100",
        "1100101101111111",
        "0110111010011101",
        "011100011",
    ]
)

charset = "0123456789_abcdefghijklmnopqrstuvwxyz"
sha256hex = "53cf379fa8fd802fd2f99b2aa395fe8b19b066ab5e2ff49e44633ce046c346c4"


def dfs(s, i):
    if i == len(enc):
        s = "amateursCTF{" + s + "}"
        if sha256(bytes(s, "utf-8")).hexdigest() == sha256hex:
            print(s)

    if i + 6 <= len(enc):
        x6 = chr(int(enc[i : i + 6], 2))
        if x6 in charset:
            dfs(s + x6, i + 6)
    if i + 7 <= len(enc):
        x7 = chr(int(enc[i : i + 7], 2))
        if x7 in charset:
            dfs(s + x7, i + 7)


dfs("", 0)
#amateursCTF{7his_belongs_ins1de_mi5c}

flagchecker #

I was making this simple flag checker in Scratch,
but my friend logged into my account and messed up all my variable names :(.
Can you help me recover my flag please?
You should run on Turbowarp for better performance.

A Scratch 3.0 project.

A quick search shows that a .sb3 file is simply a zip archive.

Extract project.json, then deobfuscate the variable names with a regex.

import re
pat1 = re.compile(r"(DAT_694206942069420(0*))")

def rename(s):
    return 'VAR' + str(len(s.groups()[1]))

with open("./project.json.bak", "r") as fp:
    buf = re.sub(pat1, rename, fp.read())
    with open("./project.json", "w") as fq:
        fq.write(buf)

Turbowarp - Run Scratch projects faster

The constant 0x9e3779b9 shows up, which suggests TEA.
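For reference, here is a standard-library sketch of textbook TEA with 32 rounds, to show where the constant appears and why decryption starts from 0xc6ef3720 (32 times the delta, modulo 2^32). Big-endian word order is an assumption, and the plaintext block is made up.

```python
import struct

DELTA = 0x9E3779B9
MASK = 0xFFFFFFFF


def tea_encrypt(block: bytes, key) -> bytes:
    v0, v1 = struct.unpack(">2I", block)
    total = 0
    for _ in range(32):
        total = (total + DELTA) & MASK
        v0 = (v0 + (((v1 << 4) + key[0]) ^ (v1 + total) ^ ((v1 >> 5) + key[1]))) & MASK
        v1 = (v1 + (((v0 << 4) + key[2]) ^ (v0 + total) ^ ((v0 >> 5) + key[3]))) & MASK
    return struct.pack(">2I", v0, v1)


def tea_decrypt(block: bytes, key) -> bytes:
    v0, v1 = struct.unpack(">2I", block)
    total = (DELTA * 32) & MASK  # value of total after 32 encryption rounds
    for _ in range(32):
        v1 = (v1 - (((v0 << 4) + key[2]) ^ (v0 + total) ^ ((v0 >> 5) + key[3]))) & MASK
        v0 = (v0 - (((v1 << 4) + key[0]) ^ (v1 + total) ^ ((v1 >> 5) + key[1]))) & MASK
        total = (total - DELTA) & MASK
    return struct.pack(">2I", v0, v1)


demo_key = [69420, 1412141, 1936419188, 1953260915]
demo_block = b"8 bytes!"  # arbitrary plaintext for the round trip
print(tea_decrypt(tea_encrypt(demo_block, demo_key), demo_key))
print(hex((DELTA * 32) & MASK))
```

Spotting either 0x9e3779b9 or 0xc6ef3720 in a binary is a common hint for TEA or one of its variants, though the round function still needs to be compared against the reference before relying on it.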

The ciphertext has to be obtained by selecting the flag variable in the left toolbar, viewing it, and then exporting it.

from pwn import *

secret = [239, 202, 230, 114, 17, 147, 199, 39, 182, 230, 119, 248, 78, 246,
          224, 46, 99, 164, 112, 134, 30, 216, 53, 194, 60, 75, 223, 122, 67,
          202, 207, 56, 16, 128, 216, 142, 248, 16, 27, 202, 119, 105, 158,
          232, 251, 201, 158, 69, 242, 193, 90, 191, 63, 96, 38, 164]

key = [69420, 1412141, 1936419188, 1953260915]

def decrypt(v, k):
    total, delta = 0xc6ef3720, 0x9e3779b9
    v0, v1 = u32(v[0],endianness='big'), u32(v[1],endianness='big')
    k0, k1, k2, k3 = k[0], k[1], k[2], k[3]
    for i in range(32):
        v1 = (v1 - (((v0 << 4) + k2) ^ (v0 + total)
              ^ ((v0 >> 5) + k3))) & 0xffffffff
        v0 = (v0 - (((v1 << 4) + k0) ^ (v1 + total)
              ^ ((v1 >> 5) + k1))) & 0xffffffff
        total = (total - delta) & 0xffffffff
    return p32(v0,endianness='big') + p32(v1,endianness='big')

flag = b''
for idx in range(0, len(secret), 8):
    flag += decrypt([bytes(secret[idx:idx+4]), bytes(secret[idx+4:idx+8])], key)

print(flag)

amateursCTF{screw_scratch_llvm_we_code_by_hand_1a89c87b}

jsrev #

Someone wanted this, so I delivered. Have fun!
jsrev.amt.rs

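The source includes three.js.

It comes with a coordinate file, coords.json.

three.js is a 3D graphics engine, so a reasonable guess is that the 3D scatter plot formed by these coordinates is the flag.

Use pyplot to draw the scatter plot: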

import matplotlib.pyplot as plt
import numpy as np
import json

with open("coords.json", "r") as fp:
    coords = json.load(fp)

X = np.array([c[0] for c in coords])
Y = np.array([c[1] for c in coords])
Z = np.array([c[2] for c in coords])

midx = (min(X) + max(X)) / 2
midy = (min(Y) + max(Y)) / 2
midz = (min(Z) + max(Z)) / 2
X = np.array([x - midx for x in X])
Y = np.array([y - midy for y in Y])
Z = np.array([z - midz for z in Z])

ax = plt.subplot(projection="3d")
ax.set_title("jsrev")
ax.scatter(X, Y, Z, s=2, c="g")

ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_zlabel("Z")

ax.set_aspect("equal")

plt.show()

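The spheres are all blended together; an offset of k*y along the x axis or the z axis is enough to separate the individual letters.

After some testing, reasonable parameters are as follows (including swapping the x and z axes and negating):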

# linear transform
XX = np.array([Z[i]-8*Y[i] for i in range(len(X))])
YY = np.array([0 for i in range(len(Y))])
ZZ = np.array([-X[i] for i in range(len(Z))])
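The same shear can be written as a small pure-Python helper, which makes the role of k easier to see. This is a sketch: `shear_project` and the two sample points are hypothetical, and k = 8 is simply the value used above.

```python
def shear_project(points, k=8.0):
    """Map (x, y, z) to 2D: horizontal = z - k*y, vertical = -x."""
    return [(z - k * y, -x) for x, y, z in points]


# Two points that coincide in x and z but sit at different y values
overlapping = [(1.0, 0.0, 3.0), (1.0, 2.0, 3.0)]
print(shear_project(overlapping))
```

Points that share the same x and z end up k*y apart horizontally, which is what pulls the stacked letters apart into a readable row.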