The goal is not to memorize everything. The goal is that within 30 seconds of reading a spec, you know which part of the handbook holds the answer.
Step 1: pick the Group first → Flat+Lazy / Nested+TTL / Ring+Capacity / StateMachine / Multi-Collection
Step 2: then pick the Level 3 pattern → lazy helper / inline TTL / capacity / backoff / transfer
Step 3: finally pick the L5-L6 pattern → single-key lock / pair-lock / worker pool / fail-fast / all-sleep
| If the spec says | First reaction | Section to read |
|---|---|---|
| clockwise, ring, replica, virtual node | Hashring family | Hashring / ChatRoute / Group C |
| request size, memory, ram, bandwidth | ChatRoute delta; load is measured in MB, not as a count | ChatRoute L2/L4/L6 |
| field, scan, prefix, ttl | InMemDB / DNS family | InMemDB / DNS / Group B |
| strictly less than | alive only while ts < expiry; the item is already dead at the expiry timestamp itself | L3 TTL Variants |
| remaining lifespan, restore | backup stores the remaining TTL, restore recomputes expiry | InMemDB L4 / DNS L4 |
| worker, num_workers, queue | worker pool, not one-op-per-coroutine | TaskQueue L5 / L6 Patterns |
| dependency, blocked until | DAG / state machine | TaskQueue L4 / Workflow |
| copy, transfer, upgrade | two keys; lock them in sorted() order to prevent deadlock | L5 Concurrent Batch Variants |
| without simulating, skip missing | fail-fast; do the check before the semaphore | L6 Rate Limited Variants |
| every request must attempt, even if missing | all-sleep; failures still sleep | L6 Rate Limited Variants |
Python search terms: python asyncio semaphore gather, python sorted desc asc tie, python bisect right, python deepcopy

| Words to search for in your notes / browser | Use |
|---|---|
| fail-fast | every L6 problem where an invalid request does not sleep |
| all-sleep | every L6 problem where even a failing request must call the API |
| worker pool | TaskQueue / fixed workers / workers pulling tasks off a queue |
| remaining_ttl | backup / restore with TTL |
| sorted to prevent deadlock | copy / transfer / upgrade / pair lock |
| strictly less than | TTL boundary |
| computed metric | sorts by qty * price / efficiency / revenue per night |
| state machine | QUEUED / PROCESSING / COMPLETED / FAILED style problems |
| inline TTL | InMemDB / DNS style patterns that never physically delete |
| lazy helper | Bank / Hotel / retry / scheduled events |
Search rule: do not search the full domain name first; search the pattern word first. You may misremember a domain name, but fail-fast, remaining_ttl and worker pool are always spelled the same.
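The two L6 keywords above differ only in where the validity check sits relative to the semaphore. Below is a minimal sketch of both, assuming a plain dict `items` and a short `asyncio.sleep` as a stand-in for the simulated API call; the function names `run_fail_fast` and `run_all_sleep` are made up for illustration and do not come from any spec.

```python
import asyncio


async def run_fail_fast(items, request_ids, max_concurrent):
    """Fail-fast: an invalid id returns False before touching the semaphore."""
    sem = asyncio.Semaphore(max_concurrent)  # created inside the function, not in __init__

    async def one(req_id):
        if req_id not in items:        # check BEFORE the semaphore
            return False               # no slot taken, no sleep
        async with sem:
            await asyncio.sleep(0.01)  # stand-in for the simulated API call
        return True

    return await asyncio.gather(*(one(r) for r in request_ids))


async def run_all_sleep(items, request_ids, max_concurrent):
    """All-sleep: every request takes a slot and sleeps, even one that will fail."""
    sem = asyncio.Semaphore(max_concurrent)

    async def one(req_id):
        async with sem:
            await asyncio.sleep(0.01)  # every request pays for the call
        return req_id in items         # success is decided after the attempt

    return await asyncio.gather(*(one(r) for r in request_ids))


fail_fast_result = asyncio.run(run_fail_fast({"a": 1}, ["a", "x"], 2))
all_sleep_result = asyncio.run(run_all_sleep({"a": 1}, ["a", "x"], 2))
```

Both return the same booleans here; the difference is only in how many requests occupy a semaphore slot and sleep.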
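Goal: you do not need to guess the data structure by feel. Read __init__ and the L1 method signatures, answer 3 questions, and you can already reverse-engineer the rough skeleton.
Mnemonic: find the main-character ID first → then count how many IDs create takes → finally separate which parameters are IDs and which are data.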
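| Step | What to ask yourself | What you get |
|---|---|---|
| Question 1 | Who is the main character? That is, which ID appears in the most method signatures? | A rough idea of which ID the main collection should be built around. |
| Question 2 | How many IDs does the create / add / register method take? | Whether it is a one-level dict, a two-level dict, or a flat dict + data fields. |
| Question 3 | In create, which parameters are IDs and which are the data that actually gets stored? | What the inner value should look like. |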
create_alert(timestamp, user_id, alert_id, severity, message)
acknowledge(timestamp, alert_id)
get_active_count(timestamp, user_id)
→ alert_id appears 2 times
→ user_id appears 2 times
→ a tie, so the main character is not decided yet
→ next step: count how many IDs create_alert takes
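The count above can also be done mechanically. This is only a sketch: it assumes every ID parameter ends in `_id`, and the helper name `count_ids` is made up for illustration.

```python
from collections import Counter


def count_ids(signatures):
    """Count how many method signatures mention each *_id parameter."""
    counts = Counter()
    for params in signatures.values():
        for name in params:
            if name.endswith("_id"):  # assumption: IDs are named xxx_id
                counts[name] += 1
    return counts


signatures = {
    "create_alert": ["timestamp", "user_id", "alert_id", "severity", "message"],
    "acknowledge": ["timestamp", "alert_id"],
    "get_active_count": ["timestamp", "user_id"],
}
counts = count_ids(signatures)
ids_in_create = [p for p in signatures["create_alert"] if p.endswith("_id")]
```

Here `counts` shows the tie (2 vs 2) and `ids_in_create` holds the two IDs that Question 2 looks at.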
| How many IDs create takes | The shape you should think of first |
|---|---|
| 1 ID | self.items = {id: {field: val}} |
| 2 IDs | self.items = {id1: {id2: {field: val}}} or a flat dict |
| 3 IDs | Rare; usually two levels + one extra field rather than a true 3-level dict |
create_alert(timestamp, user_id, alert_id, severity, message)
^^^^^^^ ^^^^^^^^
Two IDs
→ First reaction:
this is most likely a two-level dict,
or a flat alert dict that also stores user_id inside each alert
create_alert(timestamp, user_id, alert_id, severity, message)
^^^^^^^ ^^^^^^^^ ^^^^^^^^ ^^^^^^^
ID ID data data
In other words: user_id and alert_id are used to look things up; severity and message are the values that actually get stored.
Two IDs + two data fields
→ First intuitive layout:
self.alerts = {
    first_id: {
        second_id: {data1: val, data2: val},
    },
}
create_alert(timestamp, user_id, alert_id, severity, message)
acknowledge(timestamp, alert_id)
get_active_count(timestamp, user_id)
Breakdown:
user_id is used and alert_id is used; create carries both IDs at once, so the first reaction is a two-level dict or a flat alert dict.
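| Option | Data Structure | When it fits |
|---|---|---|
| Option A | self.alerts = {user_id: {alert_id: {severity, message}}} | When most later queries are "given a user_id, look at all alerts under that user". |
| Option B | self.alerts = {alert_id: {user_id, severity, message}} | When most later queries are "given an alert_id, directly update / ack / delete that alert". |

Answer: pick whichever makes your later queries easiest.
The first parameter is not always the outer key. Look at how the later methods look things up. If most of them are single-alert operations like acknowledge(alert_id), a flat dict is the more natural fit.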
# If the later methods are mostly:
acknowledge(timestamp, alert_id)
delete_alert(timestamp, alert_id)
get_alert(timestamp, alert_id)
→ the flat alert dict is usually easier
# If the later methods are mostly:
list_alerts(timestamp, user_id)
get_active_count(timestamp, user_id)
get_highest_severity(timestamp, user_id)
→ the user-wraps-alerts layout is usually easier
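To make the trade-off concrete, here is a minimal sketch of both layouts for the alert example. The `acked` field and the class names `AlertsNested` and `AlertsFlat` are assumptions added for illustration; a real spec may define duplicates and return values differently.

```python
class AlertsNested:
    """Option A: user_id -> {alert_id: {...}}. Per-user queries are direct."""

    def __init__(self):
        self.alerts = {}

    def create_alert(self, timestamp, user_id, alert_id, severity, message):
        user_alerts = self.alerts.setdefault(user_id, {})
        if alert_id in user_alerts:
            return False
        user_alerts[alert_id] = {"severity": severity, "message": message, "acked": False}
        return True

    def acknowledge(self, timestamp, alert_id):
        # Only alert_id is given, so every user has to be scanned.
        for user_alerts in self.alerts.values():
            if alert_id in user_alerts:
                user_alerts[alert_id]["acked"] = True
                return True
        return False

    def get_active_count(self, timestamp, user_id):
        return sum(1 for a in self.alerts.get(user_id, {}).values() if not a["acked"])


class AlertsFlat:
    """Option B: alert_id -> {user_id, ...}. Single-alert operations are direct."""

    def __init__(self):
        self.alerts = {}

    def create_alert(self, timestamp, user_id, alert_id, severity, message):
        if alert_id in self.alerts:
            return False
        self.alerts[alert_id] = {
            "user_id": user_id, "severity": severity, "message": message, "acked": False,
        }
        return True

    def acknowledge(self, timestamp, alert_id):
        if alert_id not in self.alerts:
            return False
        self.alerts[alert_id]["acked"] = True  # one direct lookup
        return True

    def get_active_count(self, timestamp, user_id):
        # Only user_id is given, so every alert has to be scanned.
        return sum(1 for a in self.alerts.values()
                   if a["user_id"] == user_id and not a["acked"])
```

Both classes answer the same calls; they only differ in which method needs the for loop.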
| What you see | First reaction |
|---|---|
| create takes only 1 ID | self.items = {id: {field: val}} |
| create takes 2 IDs | self.items = {id1: {id2: {field: val}}} or a flat dict |
| 1 ID + many data fields | self.items = {id: {f1: v1, f2: v2, f3: v3}} |
| Many methods get / update / delete directly by the same ID | That ID is very likely the outer key |
| Many methods take the big ID first, then look up the small ID inside it | Strong smell of a two-level dict |
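Purpose of this page: sort every init / L1 data shape into 3 families. Each table already has six rows reserved for L1-L6, so you can fill in the details row by row later.
The core idea: there are really only 3 families. Pick the family first, then think about the add-ons that come later: TTL / backup / history / merge / lock.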
| # | Check | Result |
|---|---|---|
| 1 | How many create / register methods are there? | 2 → Family 3 (two dicts); 1 → go to #2 |
| 2 | Is there an add_X method (adds a sub-item)? | Yes → Family 2 (nested dict); No → Family 1 (flat dict) |
| 3 | Is there a method that takes only one ID (without the other ID)? | Yes → see the detailed #3 rule below |
| 4 | Does L3 have ttl_ms? | Yes → add "expiry": timestamp + ttl to the item |
| 5 | Does L4 have backup / restore? | Yes → add self.backups = [] |
| 6 | Does L4 have merge? | Yes → add self.merged_X = {} |
| 7 | Does L4 have get_X_history? | Yes → see the History rule below |
| 8 | L5? | Add self.locks = defaultdict(asyncio.Lock) |
| 9 | L6? | The semaphore is created inside the function (nothing to add in __init__) |
| What the method does | Open a dict? | Why | Examples |
|---|---|---|---|
| Modify / delete (write): del / update / revoke | Yes, worth it | You have to find the item and then change it. Without a dict you write nested for loops to find it; with a dict it is one line. | remove_item(item_id) → del self.items[item_id]; revoke_cert(cert_id) → del self.certs[cert_id]; clear_violation(v_id) → self.violations[v_id]["status"] = "CLEARED" |
| Query / count (read): count / list / get_all | No, a for loop is enough | You have to walk everything to count or collect anyway. With or without its own dict, it is still a for loop. | get_active_count(user_id) → for loop counting matches; get_chef_recipes(chef_id) → for loop collecting matches; get_site_participants(site_name) → for loop search |
Merge dict: old_id → new_id, remembering which item was merged into which.
# What merge_study("old_study", "new_study") does:
self.merged_studies[from_study] = to_study  # remember where it went
del self.studies[from_study]  # delete the source
# What the merged dict looks like:
self.merged_studies = {
    "old_study": "new_study",  # old merged into new
    "another_old": "new_study",  # another_old also merged into new
}
Use: after a merge, when someone calls get_X("old_study"), look it up in merged_studies to learn that it was merged into "new_study", then return an error or redirect, whichever the spec asks for.

| Domain | Dict name | Contents |
|---|---|---|
| Bank | self.merged_accounts | {"bob": "alice"} |
| Parking | self.merged_lots | {"lot_b": "lot_a"} |
| Compliance | self.merged_entities | {"entity_old": "entity_new"} |
| Clinical Trial | self.merged_studies | {"study_old": "study_new"} |

Every domain uses exactly the same shape: {old_id: new_id}.
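If a spec asks for a redirect rather than an error, the lookup can follow the mapping until it reaches an id that was never merged away. A minimal sketch, assuming chains are possible (the target of one merge is itself merged later); the helper name `resolve_merged` is hypothetical.

```python
def resolve_merged(merged, item_id):
    """Follow old_id -> new_id links until reaching an id that is not merged away."""
    seen = set()
    while item_id in merged and item_id not in seen:  # 'seen' guards against cycles
        seen.add(item_id)
        item_id = merged[item_id]
    return item_id


merged_studies = {
    "old_study": "new_study",
    "another_old": "new_study",
    "new_study": "final_study",  # assumption: new_study was itself merged later
}
resolved = resolve_merged(merged_studies, "old_study")
untouched = resolve_merged(merged_studies, "other_study")
```

An id that never appears as a key is returned unchanged, so the same helper works for live items.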
When you see get_X_history(some_id), you know you need to add history.

| Situation | What to do | Example |
|---|---|---|
| The history ID is the key of an existing dict | Put it inside that dict | get_car_history(lot_id) → lot_id is the key of self.lots → store it inside the lot: lots[lot_id]["history"] |
| The history ID is not the key of any dict | Open a separate self.history = defaultdict(list) | get_violation_history(entity_id) → entity_id is not a key of self.policies → entity_id is not a key of self.violations → no dict is keyed by entity_id → open a separate one: self.history = defaultdict(list) |

| Method | ID | Key of which dict? | Where it goes |
|---|---|---|---|
| get_license_history(product_id) | product_id | == key of self.products ✓ | products[pid]["history"] |
| get_signal_history(channel_id) | channel_id | == key of self.channels ✓ | channels[cid]["history"] |
| get_booking_history(room_id) | room_id | == key of self.rooms ✓ | rooms[rid]["history"] |
| get_violation_history(entity_id) | entity_id | none of them ✗ | self.history = defaultdict(list) |
| get_content_history(content_id) | content_id | none of them ✗ | self.history = defaultdict(list) |
Family 1: one dict, no sub-items
Full __init__ (covers L1-L6)
How to spot it: there is only one create method and no add_X method.
Examples: Bank, Leaderboard, Scheduler

| Level | Methods you may see | Latest __init__ + data structure |
|---|---|---|
| L1 | CRUD: create / read / update / delete. Bank: create_account, deposit, get_balance. Leaderboard: register_player, update_score. Scheduler: create_event, cancel_event. Session: create_session, end_session | |
| L2 | Sort / query / stats. Bank: top_spenders(n), sorted by outgoing. Leaderboard: top_players(n), sorted by score. Scheduler: get_next_event(ts). All of them are a for loop + sorted | |
| L3 | TTL / expiry / automatic state change. Session: create_session(ts, id, ttl) → becomes invalid automatically on expiry. Scheduler: an event fires automatically when due. Bank: schedule_payment → credited automatically when due. Every method starts with _purge_expired(ts) | |
| L4 | Backup / History / Merge. Bank: backup, restore, get_balance_at(time), merge_accounts. Leaderboard: season_snapshot, season_restore. With TTL → backup stores remaining_ttl, restore recomputes expiry. merge → add the numbers together + del source + record it in merged_items | |
| L5 | Async batch: lock + gather. Bank: process_batch(ops) → deposit/pay: async with locks[account_id]; transfer: sorted([src, dst]) pair-lock. Leaderboard: batch_update(ops), single-key lock | |
| L6 | Rate limited: sem + sleep. Bank: process_external_transfers(transfers, max_concurrent) → insufficient balance → fail-fast (never enters sem); enough balance → deduct → sem + sleep → True. Leaderboard: sync_scores(ids, max_concurrent) | |
Family 2: one dict with sub-items inside
Full __init__ (covers L1-L6)
How to spot it: there is one create plus one add_X (add_ingredient, inject_signal, park_car).
Examples: Parking, InMemDB, Channel, Playlist, Hotel (a room holds guests)

| Level | Methods you may see | Latest __init__ + data structure |
|---|---|---|
| L1 | CRUD: open a container + add / remove sub-items. Parking: add_lot(lot_id, capacity), park_car(lot_id, car_id), remove_car. InMemDB: no create_db needed, go straight to set(key, field, value), get, delete. DNS: add_record(domain, type, ip), delete_record. Channel: create_channel(id, max), inject_signal(ch_id, sig_id, strength) | |
| L2 | Sort / find a sub-item across containers. Parking: find_car(car_id), for loop over every lot. Hashring: top_loaded(n), which node holds the most keys. DNS: get_records(domain), list every record of a domain. InMemDB: scan(key), list every field of a key | |
| L3 | TTL on sub-items. Parking: park_car gains ttl_ms → the car leaves automatically on expiry. InMemDB: set gains ttl_ms → the field disappears on expiry. DNS: record gains TTL → no longer resolves after expiry. Every method starts by purging expired sub-items | |
| L4 | Backup (with TTL, store the remaining TTL) / History / Merge. InMemDB: backup → store remaining_ttl per field; restore → recompute expiry. Parking: get_car_history(lot_id) → every car that ever parked there. FS: copy_file(src, dst) → copy the remaining TTL. Hotel: upgrade_room(from, to) → move the guest | |
| L5 | Async batch: per-item lock. InMemDB: batch_operations(ops) → locks[key]. FS: batch_ops → add/delete single lock, copy sorted pair-lock. Hotel: batch_ops → book/checkout single lock, upgrade pair-lock. Parking: park/remove single lock, transfer pair-lock | |
| L6 | Rate limited. InMemDB: batch_scan(keys, max_concurrent) → all-sleep (every request sleeps). DNS: propagate(domains, max_concurrent) → fail-fast (return at once if the domain does not exist). FS: sync_files(paths, max_concurrent) → fail-fast. Hashring: sync_replicas(reqs, max_concurrent) → fail-fast | |
Family 3: two dicts, two different kinds of thing
Full __init__ (covers L1-L6)
How to spot it: there are two different create / register methods, or there is a method that takes only one ID and that ID is not the key of the main dict.
Examples: Compliance (policy + violation), Workflow (workflow + step), Recipe (recipe + chef)

| Level | Methods you may see | Latest __init__ + data structure |
|---|---|---|
| L1 | CRUD: each kind has its own create. Compliance: register_policy(p_id, desc, max) + flag_violation(v_id, p_id, entity, sev). Workflow: create_workflow(wf_id, steps), steps stored in A, status stored in B. Spectrum: register_band(b_id, freq_start, freq_end) + lease(b_id, op_id, lease_id). B records a_id to say which A it belongs to | |
| L2 | Cross-lookup / sort. Compliance: find_violation(v_id) → return (policy_id, entity_id). Compliance: get_worst_entities(n) → for loop over B counting violations per entity. Workflow: get_status(wf_id, step_id) → B[(wf_id, step_id)]. Spectrum: get_operator_bands(op_id) → for loop over B | |
| L3 | TTL on B. Compliance: violation gains ttl_ms → auto-resolve on expiry. Spectrum: lease gains ttl_ms → auto-revoke on expiry. Moderation: report gains ttl_ms → auto-dismiss on expiry. Every method starts by purging expired B | |
| L4 | Backup / History / Merge. Compliance: backup/restore (violation has TTL → remaining_ttl). Compliance: get_violation_history(entity_id) → spans many B, use self.history = defaultdict(list). Compliance: merge_entity(from, to) → move every violation to the new entity. Spectrum: transfer_lease(lease_id, new_op) → change other_id inside B. Workflow: get_history(wf_id), fail_step → rollback | |
| L5 | Async batch. Compliance: batch_audit(ops) → flag/clear lock locks[v_id], transfer locks both entities. Moderation: batch_moderate(ops) → submit/claim/resolve single lock, escalate pair-lock. Spectrum: batch_ops(ops) → lease/revoke single lock, transfer pair-lock | |
| L6 | Rate limited. Compliance: report_violations(v_ids, max_concurrent) → missing / already cleared → fail-fast; active → sem + sleep + mark REPORTED. Moderation: send_decisions(report_ids, max_concurrent) → not yet resolved → fail-fast. Spectrum: sync_bands(band_ids, max_concurrent) → band does not exist → fail-fast | |
Whichever family it is, L3-L5 mostly stack these add-ons on top:
L3 has TTL
→ add an expiry field to the sub-item / item
L4 has backup
→ self.backups = []
L4 has history
→ add history: [] inside the item
   or self.history = defaultdict(list)
L4 has merge
→ self.merged_X = {}
L5
→ self.locks = defaultdict(asyncio.Lock)
#2 park_car (add_X) → Family 2 ✓
#3 find_car(car_id) takes only car_id → find it with a for loop ✓
#4 L3 has TTL → add expiry to each car ✓
#5 L4 has backup → backups = [] ✓
#6 L4 has no merge ✗
#7 L4 has history → add history: [] to each lot ✓
#8 L5 → locks ✓
L1: add_lot(lot_id, capacity)
    park_car(lot_id, car_id)
    remove_car(lot_id, car_id)
    get_available_spots(lot_id)
L2: find_car(car_id)  # which lot
    get_fullest_lots(n)
L3: park_car gains ttl_ms
    get_parking_fee(lot_id, car_id)
L4: backup / restore (remaining_ttl)
    get_car_history(lot_id)
L5: batch: park/remove single lock
    transfer car pair-lock
L6: sync_lots(lot_ids, max_concurrent)
    fail-fast
def __init__(self):
    self.lots = {}
    self.backups = []
    self.locks = defaultdict(asyncio.Lock)
self.lots = {
"lot1": {
"capacity": 5,
"cars": {
"car1": {"park_time": 100, "expiry": 5100},
"car2": {"park_time": 200, "expiry": None},
},
"history": ["car1", "car2", "car3"],
},
}
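A minimal sketch of how the Parking shape above behaves at L1-L3. It assumes every method takes a leading `timestamp` (as in the generic template) and that a lot refuses cars once it is full; both are assumptions, so check the real spec.

```python
class Parking:
    def __init__(self):
        self.lots = {}  # lot_id -> {"capacity", "cars": {car_id: {...}}, "history": [...]}

    def _purge(self, timestamp):
        # Lazy cleanup: drop every car whose expiry has been reached.
        for lot in self.lots.values():
            for car_id in list(lot["cars"]):
                exp = lot["cars"][car_id]["expiry"]
                if exp is not None and timestamp >= exp:
                    del lot["cars"][car_id]

    def add_lot(self, timestamp, lot_id, capacity):
        if lot_id in self.lots:
            return False
        self.lots[lot_id] = {"capacity": capacity, "cars": {}, "history": []}
        return True

    def park_car(self, timestamp, lot_id, car_id, ttl_ms=None):
        self._purge(timestamp)
        lot = self.lots.get(lot_id)
        if lot is None or car_id in lot["cars"] or len(lot["cars"]) >= lot["capacity"]:
            return False
        expiry = None if ttl_ms is None else timestamp + ttl_ms
        lot["cars"][car_id] = {"park_time": timestamp, "expiry": expiry}
        lot["history"].append(car_id)  # history lives inside the lot (keyed by lot_id)
        return True

    def find_car(self, timestamp, car_id):
        # Only car_id is given, so walk every lot.
        self._purge(timestamp)
        for lot_id in sorted(self.lots):
            if car_id in self.lots[lot_id]["cars"]:
                return lot_id
        return None


p = Parking()
p.add_lot(0, "lot1", 1)
first = p.park_car(100, "lot1", "car1", 5000)   # expiry = 5100
second = p.park_car(200, "lot1", "car2")        # lot is full
where = p.find_car(300, "car1")
gone = p.find_car(5100, "car1")                 # expired exactly at 5100
third = p.park_car(5100, "lot1", "car2")        # the slot is free again
```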
#2 inject_signal (add_X) → Family 2 ✓
#3 find_signal(signal_id) takes only signal_id → find it with a for loop ✓
#4 L3 has TTL → add expiry to each signal ✓
#5 L4 has backup → backups = [] ✓
#6 L4 has no merge ✗
#7 L4 has history → add history: [] to each channel ✓
#8 L5 → locks ✓
L1: create_channel(channel_id, max_signals)
    inject_signal(channel_id, signal_id, strength)
    drop_signal(channel_id, signal_id)
    read_channel(channel_id)
L2: strongest_channels(n)  # by total strength
    find_signal(signal_id)  # which channel
L3: inject_signal gains ttl_ms
    expired signals are not counted
L4: backup / restore (remaining_ttl)
    get_signal_history(channel_id)
L5: batch: inject/drop single lock
    transfer signal pair-lock
L6: sync_channels(channel_ids, max_concurrent)
    fail-fast
def __init__(self):
    self.channels = {}
    self.backups = []
    self.locks = defaultdict(asyncio.Lock)
self.channels = {
"c1": {
"max_signals": 5,
"signals": {
"s1": {"strength": 80, "expiry": 5000},
"s2": {"strength": 40, "expiry": None},
},
"history": ["s1", "s2", "s3"],
},
}
#2 claim_unit (add_X) → Family 2 ✓
#3 get_job_claims(job_id) takes only job_id → find it with a for loop ✓
#4 L3 has TTL → add expiry to each claim ✓
#5 L4 has backup → backups = [] ✓
#6 L4 has no merge ✗
#7 L4 has history → add history: [] to each pool ✓
#8 L5 → locks ✓
L1: create_pool(pool_id, max_units)
    claim_unit(pool_id, job_id, unit_count)
    release(pool_id, job_id)
    get_utilization(pool_id)
    get_job_claims(job_id)  # for loop over all pools
L2: top pools by utilization
L3: claim gains ttl_ms → expired claim auto-release
L4: backup / restore (remaining_ttl)
    get_claim_history(pool_id)
L5: batch: claim/release single lock
    transfer pair-lock
L6: sync_pools(pool_ids, max_concurrent)
    fail-fast
def __init__(self):
    self.pools = {}
    self.backups = []
    self.locks = defaultdict(asyncio.Lock)
self.pools = {
"pool1": {
"max_units": 10,
"jobs": {
"job1": {"units": 3, "expiry": 8000},
"job2": {"units": 5, "expiry": None},
},
"history": ["job1", "job2"],
},
}
#2 reserve (add_X) → Family 2 ✓
#3 get_tenant_zones(tenant_id) takes only tenant_id → find it with a for loop ✓
#4 L3 has TTL → add expiry to each reservation ✓
#5 L4 has backup → backups = [] ✓
#6 L4 has merge → merged_zones = {} ✓
#7 L4 has history → add history: [] to each zone ✓
#8 L5 → locks ✓
L1: create_zone(zone_id, total_slots)
    reserve(zone_id, tenant_id, slot_count)
    release(zone_id, tenant_id)
    get_remaining(zone_id)
L2: get_tenant_zones(tenant_id)  # for loop
    get_busiest_zones(n)
L3: reserve gains ttl_ms
    extend(zone_id, tenant_id, extra_ms)
L4: backup / restore (remaining_ttl)
    get_reservation_history(zone_id)
    merge_zone(from, to)
L5: batch: reserve/release single lock
    transfer pair-lock
L6: sync_zones(zone_ids, max_concurrent)
    fail-fast
def __init__(self):
    self.zones = {}
    self.backups = []
    self.merged_zones = {}
    self.locks = defaultdict(asyncio.Lock)
self.zones = {
"z1": {
"total_slots": 10,
"tenants": {
"t1": {"slots": 3, "expiry": 8000},
"t2": {"slots": 5, "expiry": None},
},
"history": ["t1", "t2", "t3"],
},
}
self.merged_zones = {
    "old_zone": "new_zone",  # old merged into new
}
#2 add_ingredient (add_X) → Family 2 ✓
#3 get_chef_recipes(chef_id) takes only chef_id → find it with a for loop ✓
#4 L3 has TTL → add expiry to each ingredient ✓
#5 L4 has backup → backups = [] ✓
#6 L4 has no merge ✗
#7 L4 has history → add history: [] to each recipe ✓
#8 L5 → locks ✓
L1: create_recipe(recipe_id, chef_id, cook_time)
    add_ingredient(recipe_id, ingredient_name, qty)
    remove_ingredient(recipe_id, ingredient_name)
    get_recipe(recipe_id)
    get_chef_recipes(chef_id)  # for loop
L2: search recipes by ingredient
    top recipes by ingredient count
L3: ingredient gains ttl_ms (expired = used up)
L4: backup / restore
    get_recipe_history(recipe_id)
L5: batch: add/remove single lock
L6: sync_recipes(recipe_ids, max_concurrent)
    fail-fast
def __init__(self):
    self.recipes = {}
    self.backups = []
    self.locks = defaultdict(asyncio.Lock)
self.recipes = {
"r1": {
"chef_id": "chef1",
"cook_time": 30,
"ingredients": {
"flour": {"qty": 200, "expiry": None},
"milk": {"qty": 100, "expiry": 5000},
},
"history": ["flour", "sugar", "milk"],
},
}
#1 register_policy + flag_violation → Family 3 ✓
#3 clear_violation(v_id) takes only v_id → violations need their own flat dict ✓
#4 L3 has TTL → add expiry to each violation ✓
#5 L4 has backup → backups = [] ✓
#6 L4 has merge_entity → merged_entities = {} ✓
#7 L4 has history (spans many violations) → self.history = defaultdict(list) ✓
#8 L5 → locks ✓
L1: register_policy(policy_id, desc, max_violations)
    flag_violation(policy_id, entity_id, violation_id, severity)
    clear_violation(violation_id)  ← only v_id!
    get_active_violations(entity_id)
L2: get_worst_entities(n)  # for loop over violations
    find_violation(violation_id)
L3: flag_violation gains ttl_ms
    expired violation auto-resolve
L4: backup / restore (remaining_ttl)
    get_violation_history(entity_id)  # defaultdict(list)
    merge_entity(from, to)
L5: batch: flag/clear lock locks[v_id]
    transfer_violation pair-lock
L6: report_violations(v_ids, max_concurrent)
    missing / already cleared → fail-fast
    active → sem + sleep + mark REPORTED
def __init__(self):
    self.policies = {}  # A
    self.violations = {}  # B (flat!)
    self.backups = []
    self.merged_entities = {}
    self.history = defaultdict(list)
    self.locks = defaultdict(asyncio.Lock)
self.policies = {
"p1": {"description": "...", "max_violations": 3},
}
self.violations = {
"v1": {
"policy_id": "p1",
"entity_id": "e1",
"severity": 4,
"expiry": 8000,
"status": "ACTIVE",
},
}
self.history = {
"e1": ["v1", "v2", "v3"],
}
self.merged_entities = {
    "old_entity": "new_entity",  # old merged into new
}
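The Compliance shape above shows why B gets its own flat dict: the write that takes only `violation_id` is a single lookup, while the read by `entity_id` is a plain for loop. A minimal L1 sketch follows; the return values and the sorted output are assumptions, and TTL / backup / locks are left out.

```python
from collections import defaultdict


class Compliance:
    def __init__(self):
        self.policies = {}                 # A: policy_id -> policy data
        self.violations = {}               # B: violation_id -> violation data (flat)
        self.history = defaultdict(list)   # entity_id -> every violation_id ever flagged

    def register_policy(self, policy_id, description, max_violations):
        if policy_id in self.policies:
            return False
        self.policies[policy_id] = {"description": description,
                                    "max_violations": max_violations}
        return True

    def flag_violation(self, policy_id, entity_id, violation_id, severity):
        if policy_id not in self.policies or violation_id in self.violations:
            return False
        self.violations[violation_id] = {
            "policy_id": policy_id,   # B remembers which A it belongs to
            "entity_id": entity_id,
            "severity": severity,
            "status": "ACTIVE",
        }
        self.history[entity_id].append(violation_id)
        return True

    def clear_violation(self, violation_id):
        # Write with only v_id: one direct lookup thanks to the flat dict.
        v = self.violations.get(violation_id)
        if v is None or v["status"] != "ACTIVE":
            return False
        v["status"] = "CLEARED"
        return True

    def get_active_violations(self, entity_id):
        # Read by entity_id: a for loop is enough, no extra dict needed.
        return sorted(v_id for v_id, v in self.violations.items()
                      if v["entity_id"] == entity_id and v["status"] == "ACTIVE")


c = Compliance()
c.register_policy("p1", "...", 3)
c.flag_violation("p1", "e1", "v1", 4)
c.flag_violation("p1", "e1", "v2", 2)
cleared = c.clear_violation("v1")
active = c.get_active_violations("e1")
```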
#1 create_queue + submit_report → Family 3 ✓
#3 claim_report(report_id) takes only r_id → reports need their own flat dict ✓
#4 L3 has TTL → add expiry to each report ✓
#5 L4 has backup → backups = [] ✓
#6 L4 has no merge ✗
#7 L4 has history (spans many reports, by content_id) → self.history = defaultdict(list) ✓
#8 L5 → locks ✓
L1: create_queue(queue_id, priority_level)
    submit_report(queue_id, report_id, content_id, reason)
    claim_report(report_id, moderator_id)  ← only r_id!
    resolve_report(report_id, decision)
    get_pending_count(queue_id)
L2: get_moderator_workload(moderator_id)  # for loop
    get_busiest_queues(n)
    find_report(report_id)
L3: submit_report gains ttl_ms
    get_report_age(report_id)
L4: backup / restore (remaining_ttl)
    get_content_history(content_id)  # defaultdict(list)
    escalate(report_id, from_queue, to_queue)
L5: batch: submit/claim/resolve single lock
    escalate pair-lock
L6: send_decisions(report_ids, max_concurrent)
    not yet resolved → fail-fast
    resolved → sem + sleep + mark NOTIFIED
def __init__(self):
    self.queues = {}  # A
    self.reports = {}  # B (flat!)
    self.backups = []
    self.history = defaultdict(list)
    self.locks = defaultdict(asyncio.Lock)
self.queues = {
"q1": {"priority_level": 3},
}
self.reports = {
"r1": {
"queue_id": "q1",
"content_id": "c1",
"moderator_id": "mod1",
"reason": "spam",
"decision": None,
"expiry": 8000,
"status": "CLAIMED",
},
}
self.history = {
"c1": ["r1", "r2"],
}
#1 register_band + lease → Family 3 ✓
#3 revoke(lease_id) takes only lease_id → leases need their own flat dict ✓
#4 L3 has TTL → add expiry to each lease ✓
#5 L4 has backup → backups = [] ✓
#6 L4 has no merge ✗
#7 L4 has history → add history: [] to each band ✓
#8 L5 → locks ✓
L1: register_band(band_id, freq_start, freq_end)
    lease(band_id, operator_id, lease_id)  ← lease_id is independent!
    revoke(lease_id)  ← only lease_id!
    get_band_status(band_id)
L2: get_operator_bands(operator_id)  # for loop
    get_available_bands()  # filter unleased
L3: lease gains ttl_ms
    get_remaining_lease(lease_id)
L4: backup / restore (remaining_ttl)
    get_lease_history(band_id)
    transfer_lease(lease_id, new_operator_id)
L5: batch: lease/revoke single lock
    transfer pair-lock
L6: sync_bands(band_ids, max_concurrent)
    band does not exist → fail-fast
def __init__(self):
    self.bands = {}  # A
    self.leases = {}  # B (flat!)
    self.backups = []
    self.locks = defaultdict(asyncio.Lock)
self.bands = {
"b1": {
"freq_start": 700,
"freq_end": 800,
"history": ["op1", "op2"],
},
}
self.leases = {
"lease1": {
"band_id": "b1",
"operator_id": "op1",
"expiry": 8000,
},
}
This page takes the GenericF1 from the Generic Mock and expands it into a real mock page. Family 1 = one flat dict, no sub-items.
Four things to remember about Family 1:
1. There is only one main dict: self.items[item_id]
2. L1 is the most basic create / update / get / delete
3. expiry only arrives at L3; backups / history / merge only arrive at L4
4. At L5 a single-key op takes one lock; a two-key op such as transfer needs a pair-lock
Typical problems:
- Bank: account_id -> {balance, outgoing, history}
- Leaderboard: player_id -> {score, history}
- Scheduler: event_id -> {execute_at, status}
- Session: session_id -> {user_id, expiry}
Bank / Leaderboard / Scheduler / Session
As soon as you see:
- only one create
- no second-layer methods such as add_sub / add_record / park_car
- most methods take item_id directly
you should immediately think:
self.items = {
item_id: {
field1,
field2,
expiry?,
history?,
}
}
import copy
import asyncio
from collections import defaultdict

class GenericF1:
    def __init__(self):
        self.items = {}  # L1: main dict, item_id -> item data
        self.backups = []  # L4: list of snapshots
        self.merged_items = {}  # L4: old_id -> new_id
        self.locks = defaultdict(asyncio.Lock)  # L5: per-item lock
self.items = {
"item1": {
"field1": 500,
"field2": 20,
"expiry": None,
"history": [(1000, 500)],
},
"item2": {
"field1": 200,
"field2": 0,
"expiry": 9000,
"history": [(2000, 200)],
},
}
self.backups = []
self.merged_items = {}
self.locks = defaultdict(asyncio.Lock)
L1: self.items
L3: add expiry inside each item
L4: self.backups + item history + self.merged_items
L5: self.locks
L6: the semaphore is created inside the function; __init__ does not change
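The L5 line above is where the single-key lock and the pair-lock come in. A minimal sketch, assuming a simplified flat dict of id -> number (not the full item record) and a made-up class name `Accounts`: a single-key op takes one lock, while a two-key op takes both locks in `sorted()` order so that two opposite transfers cannot deadlock.

```python
import asyncio
from collections import defaultdict


class Accounts:
    def __init__(self):
        self.items = {}                          # simplified: item_id -> balance
        self.locks = defaultdict(asyncio.Lock)   # one lock per item_id, created on demand

    async def deposit(self, item_id, amount):
        async with self.locks[item_id]:          # single-key lock
            if item_id not in self.items:
                return None
            self.items[item_id] += amount
            return self.items[item_id]

    async def transfer(self, source_id, target_id, amount):
        if source_id == target_id:
            return False
        first, second = sorted([source_id, target_id])  # fixed global lock order
        async with self.locks[first]:
            async with self.locks[second]:
                if source_id not in self.items or target_id not in self.items:
                    return False
                if self.items[source_id] < amount:
                    return False
                await asyncio.sleep(0)           # yield, so other tasks really contend
                self.items[source_id] -= amount
                self.items[target_id] += amount
                return True


async def demo():
    bank = Accounts()
    bank.items = {"alice": 100, "bob": 100}
    results = await asyncio.gather(
        bank.transfer("alice", "bob", 30),
        bank.transfer("bob", "alice", 50),       # opposite direction, same lock order
        bank.deposit("alice", 5),
    )
    return results, bank.items


results, final_items = asyncio.run(demo())
```

Without the `sorted()` step, the two transfers could each hold one lock and wait forever for the other.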
def create(self, timestamp, item_id, field1, field2):  # open a new item: one new card in the master register
    if item_id in self.items:  # this id is already taken
        return False  # reject duplicates; never overwrite the existing item
    self.items[item_id] = {  # every later level builds on this record
        "field1": field1,  # main value: balance / score / execute_at
        "field2": field2,  # secondary value: outgoing / status / owner
        "expiry": None,  # L1 treats items as never expiring; L3 sets a real expiry
        "history": [(timestamp, field1)],  # used at L4; record the first snapshot at creation
    }
    return True  # created

def update(self, timestamp, item_id, field1):  # overwrite the main value of an item
    if item_id not in self.items:  # nothing to update
        return None
    self.items[item_id]["field1"] = field1  # overwrite, not add or subtract
    self.items[item_id]["history"].append((timestamp, field1))  # log time + new value for later "value at time" queries
    return self.items[item_id]["field1"]  # return the latest value

def get(self, timestamp, item_id):  # read the current main value
    if item_id not in self.items:  # no such item
        return None
    return self.items[item_id]["field1"]  # return the value only, not the whole dict

def delete(self, timestamp, item_id):  # remove the whole item from the register
    if item_id not in self.items:  # does not exist, so nothing to delete
        return False
    del self.items[item_id]
    return True

def deposit(self, timestamp, item_id, amount):  # common variant: add money / score / resources
    if item_id not in self.items:  # cannot act on a missing item
        return None
    self.items[item_id]["field1"] += amount  # add to the current value
    self.items[item_id]["history"].append((timestamp, self.items[item_id]["field1"]))  # log the new value
    return self.items[item_id]["field1"]

def transfer(self, timestamp, source_id, target_id, amount):  # pair-op variant: touches two items, still Family 1
    if source_id not in self.items or target_id not in self.items or source_id == target_id:  # any bad id → return at once
        return None
    self.items[source_id]["field1"] -= amount  # deduct from source
    self.items[target_id]["field1"] += amount  # add to target
    self.items[source_id]["history"].append((timestamp, self.items[source_id]["field1"]))  # log source's new value
    self.items[target_id]["history"].append((timestamp, self.items[target_id]["field1"]))  # log target's new value
    return self.items[source_id]["field1"]
item = self.items.get(item_id)  # Step 1: the main character is always one flat item record
if op == "create": check_duplicate = item_id in self.items  # Step 2: create checks for a duplicate first
if op != "create": check_missing = item is None  # Step 3: everything else checks for a missing item first
self.items[item_id] = {...}  # Step 4: create / update both write straight to self.items[item_id]
result = self.items[item_id]["field1"]  # Step 5: the return shape follows the spec, usually a value / True / None
def __init__(self):
    self.items = {}
self.items = {
"item1": {
"field1": 500,
"field2": 20,
},
"item2": {
"field1": 200,
"field2": 0,
},
}
Only self.items exists
create / update / get / delete all hit this directly:
self.items[item_id]
Not used yet at this level:
- self.backups
- self.merged_items
- self.locks
No second-level dict
No sub_id
No tuple key
A flat dict is simply:
item_id → item info
create(timestamp, item_id, field1, field2)
get(timestamp, item_id)
update(timestamp, item_id, new_value)
delete(timestamp, item_id)
# Common renames within the same family:
create_account / register_player / create_event / create_session
deposit / add_score / reschedule / renew
close / cancel / end / revoke
# Still Family 1, but a two-key op:
transfer(timestamp, source_id, target_id, amount)
Key points:
- Takes a single item_id → single-item CRUD
- Takes source_id + target_id together → still the flat dict family, just a pair-op
What is being tested:
- whether your flat dict CRUD skeleton is solid
- whether you missed a duplicate / missing check
- whether the return shape follows the spec
Step 1: recognise the main character as self.items[item_id]
Step 2: create checks for a duplicate first; every other method checks for a missing item first
Step 3: when opening an item, write all base fields in one go
Step 4: update / get / delete all hit self.items[item_id] directly
Step 5: when you see a two-id op like transfer, do not mistake it for another family
def top_n(self, timestamp, n):  # pick the top n items: a leaderboard
    items = []  # scratch list for ranking
    for item_id, info in self.items.items():  # walk every item and build a sortable tuple
        items.append((-info["field1"], item_id))  # negative value = larger first; id second = tie-break in ascending id order
    items.sort()  # tuples compare field by field: score desc, then id asc
    result = []  # the final ranking
    for value, item_id in items[:n]:  # keep only the first n
        result.append(item_id)  # generic version returns ids; a real problem may want a format like "alice(300)"
    return result

def items_above(self, timestamp, min_value):  # filter variant: every item at or above a threshold
    result = []
    for item_id, info in self.items.items():  # check each item once
        if info["field1"] >= min_value:  # keep it only if it meets the threshold
            result.append(item_id)
    result.sort()  # these usually return in alphabetical order
    return result

def get_next_item(self, timestamp):  # next / earliest variant: field1 is treated as a time
    best = None  # nothing chosen yet
    for item_id, info in self.items.items():
        candidate = (info["field1"], item_id)  # metric first, id as tie-break
        if best is None or candidate < best:  # smaller comes first
            best = candidate
    return None if best is None else best[1]
rows = []  # Step 1: at L2 collect first; do not format at the start
rows.append((-metric, item_id))  # Step 2: top-N usually builds (-metric, id)
rows.sort()  # Step 3: sort first, then cut the first n
filtered = [item_id for item_id, info in self.items.items() if keep(info)]  # Step 4: a filter problem is a plain for loop
return formatted_result  # Step 5: decide at the very end whether to return a list / string / count
def __init__(self):
    self.items = {}
self.items = {
"alice": {"field1": 300, "field2": 20},
"bob": {"field1": 500, "field2": 10},
"cara": {"field1": 500, "field2": 30},
}
Ranking / filter methods only read self.items
top_n(..., 2)
→ collect (-field1, item_id)
→ sort
→ ["bob", "cara"]
items_above(..., 250)
→ ["alice", "bob", "cara"]
top_n(timestamp, n)
list_sorted(timestamp)
items_above(timestamp, min_value)
get_next_item(timestamp)
count_active(timestamp)
list_by_status(timestamp, status)
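# Common names in real problems:
top_spenders / top_players / players_above
get_next_event / list_upcoming
get_active_sessions / list_live_items
The core is always:
collect -> sort/filter -> format/return
What is being tested:
- which field is the sort metric
- whether the tie-break uses id / name
- whether the result is a list, a string, or a count
Step 1: decide whether it is a top-N, a range filter, or a next/earliest query
Step 2: use a for loop to collect tuples, e.g. (-metric, item_id)
Step 3: sort and then cut the first n, or filter one by one
Step 4: if formatting is needed, build the strings last
Step 5: remember "collect, then sort, then format"; never do it in the reverse order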
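The collect → sort → format order can be sketched as one small function. The `"id(value)"` output format is an assumption borrowed from the `"alice(300)"` example; real specs vary.

```python
def top_n_formatted(items, n):
    """Collect (-metric, id) tuples, sort, cut to n, and only then format."""
    rows = []
    for item_id, info in items.items():
        rows.append((-info["field1"], item_id))   # collect
    rows.sort()                                    # metric desc, then id asc
    return [f"{item_id}({-neg})" for neg, item_id in rows[:n]]  # format last


items = {
    "alice": {"field1": 300, "field2": 20},
    "bob": {"field1": 500, "field2": 10},
    "cara": {"field1": 500, "field2": 30},
}
top_two = top_n_formatted(items, 2)  # ['bob(500)', 'cara(500)']
```

Formatting before sorting would compare strings such as `"bob(500)"`, which breaks both the numeric order and the tie-break.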
def _purge(self, timestamp):  # the helper only appears at L3, when expiry is introduced
    for item_id in list(self.items.keys()):  # list(...) copies the keys so deleting while looping is safe
        exp = self.items[item_id]["expiry"]  # None means the item never expires
        if exp is not None and timestamp >= exp:  # at or past the expiry time: the item is dead
            del self.items[item_id]  # lazy cleanup: delete dead items on sight

def create_with_ttl(self, timestamp, item_id, field1, field2, ttl_ms):  # create an item with a lifetime
    self._purge(timestamp)  # clean up first so dead items do not get in the way
    if item_id in self.items:  # still present after the purge = a genuine duplicate
        return False
    expiry = None  # default: permanent when the caller gives no ttl
    if ttl_ms is not None:  # a lifetime was given
        expiry = timestamp + ttl_ms  # convert "how long left" into "when it dies"
    self.items[item_id] = {
        "field1": field1,  # main value, stored as before
        "field2": field2,  # secondary value, stored as before
        "expiry": expiry,  # the key new field; the helper decides life or death from it
        "history": [(timestamp, field1)],  # still record the first history entry
    }
    return True

def extend(self, timestamp, item_id, extra_ms):  # extension variant: push the lifetime later
    self._purge(timestamp)  # purge first; a dead item cannot be extended
    if item_id not in self.items or self.items[item_id]["expiry"] is None:  # no item, or no TTL at all
        return False
    self.items[item_id]["expiry"] += extra_ms  # add on top of the old expiry
    return True

def process_due(self, timestamp):  # due-time variant: trigger a side effect when the time comes
    for item_id in list(self.items.keys()):
        exp = self.items[item_id]["expiry"]  # the due time
        if exp is not None and timestamp >= exp:  # due: do the work
            self.items[item_id]["field2"] += 1  # generic demo of a side effect
            self.items[item_id]["expiry"] = None  # clear the timer so it does not fire twice
expiry = timestamp + ttl_ms  # Step 1: compute the absolute expiry first
self._purge(timestamp)  # Step 2: every public method calls the helper first
if expired: delete_or_mark_inactive  # Step 3: decide whether expiry means delete or a status change
if extend: self.items[item_id]["expiry"] += extra_ms  # Step 4: an extension adds onto the old expiry
if due_side_effect: mutate_local_state  # Step 5: in a due-time problem, remember to apply the side effect
def __init__(self):
    self.items = {}
self.items = {
"item1": {"field1": 500, "field2": 20, "expiry": 9000},
"item2": {"field1": 200, "field2": 0, "expiry": None},
}
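The only real change is one more field inside each item:
"expiry"
expiry = timestamp + ttl_ms
After that, every public method starts with:
self._purge(timestamp)
On expiry the item is deleted / marked inactive,
depending on what the spec says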
create_with_ttl(timestamp, item_id, ..., ttl_ms)
update_with_ttl(timestamp, item_id, ..., ttl_ms)
extend(timestamp, item_id, extra_ms)
renew(timestamp, item_id)
expire(timestamp, item_id)
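# Time-driven variants:
schedule_payment / process_cashbacks
create_session / touch_session / expire_session
create_event / process_due_events
Family 1 L3 usually comes in two flavours:
1. delete / mark inactive once expiry is reached
2. a helper at the start of each method applies time-based side effects
What is being tested:
- whether it is a true TTL problem or a due-time side-effect problem
- whether the helper is a purge or a process_due
- which field holds the expiry / due time
Step 1: when you see ttl_ms / scheduled / cashback, switch to time-problem mode
Step 2: pick the helper style: _purge or _process_due_xxx
Step 3: add the expiry / due field
Step 4: every public method calls the helper first
Step 5: be clear whether expiry means delete, mark inactive, or a change to balance / status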
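For the "inline TTL" pattern (nothing is physically deleted), the same boundary rule can live in a small check instead of a purge helper. A minimal sketch with hypothetical free functions `is_alive` and `get_value`; the boundary follows the "strictly less than" rule, so an item is already dead at `timestamp == expiry`.

```python
def is_alive(info, timestamp):
    """An item is alive only while timestamp < expiry (None means no expiry)."""
    exp = info["expiry"]
    return exp is None or timestamp < exp


def get_value(items, timestamp, item_id):
    info = items.get(item_id)
    if info is None or not is_alive(info, timestamp):
        return None            # expired items look missing, but stay in the dict
    return info["field1"]


items = {
    "item1": {"field1": 500, "field2": 20, "expiry": 9000},
    "item2": {"field1": 200, "field2": 0, "expiry": None},
}
before = get_value(items, 8999, "item1")     # one tick before expiry
at_expiry = get_value(items, 9000, "item1")  # exactly at expiry
```

This matches the `timestamp >= exp` test used by `_purge`; the two conditions are exact opposites.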
def backup(self, timestamp): # 影一張當下全景相;之後 restore 就靠呢張相倒帶
self._purge(timestamp) # 影相前先掃走死卡;唔好將過期垃圾都影埋入備份
snapshot = copy.deepcopy(self.items) # 冇 TTL 特別玩法時,最穩陣就係成份 dict 深拷貝
self.backups.append((timestamp, snapshot)) # 連影相時間一齊收埋;之後先知邊張相最近 target_ts
def restore(self, timestamp, target_ts): # 還原返 target_ts 或之前最近嗰張相
best = None # 暫時仲未揀到要用邊張備份
for backup_ts, snapshot in self.backups: # 逐張舊相行一次
if backup_ts <= target_ts: # 呢張相喺 target_ts 或之前,先有資格做候選
best = (backup_ts, snapshot) # 一路覆蓋到最後,最後留低嘅就係最近嗰張
if best is None: # 一張合資格嘅都冇
return False
self.items = copy.deepcopy(best[1]) # 成份 state 倒帶返去;等於將成個場景換返舊版本
return True
def get_value_at(self, timestamp, item_id, time_at): # 想知某張卡喺某個舊時刻個值係幾多
self._purge(timestamp) # 先保持現場乾淨;雖然查舊值,但系統本身狀態都要先整理
if item_id not in self.items: # 呢張卡而家都唔存在
return None
for ts, val in reversed(self.items[item_id]["history"]): # 由最新歷史倒住行;因為最近嗰筆最有機會啱
if ts <= time_at: # 一見到時間冇超過目標時刻,就代表呢筆係當時有效狀態
return val
return None # 連一筆舊紀錄都搵唔到
def merge(self, timestamp, id1, id2): # 將第二張卡嘅資產 / 數值合併去第一張卡
self._purge(timestamp) # 合併前先掃走死卡;免得同一張屍體卡融合
if id1 not in self.items or id2 not in self.items or id1 == id2: # 任一張唔存在,或者自己吞自己
return False
self.items[id1]["field1"] += self.items[id2]["field1"] # 將 source 嗰份數字倒去 target;好似兩個錢包倒埋一個
self.items[id1]["history"].append((timestamp, self.items[id1]["field1"])) # 合併後新總數都要留歷史
self.merged_items[id2] = id1 # 記低「舊 id2 去咗 id1」;之後查舊人時可以追蹤去向
del self.items[id2] # source 張卡完成任務,正式收皮
return True
def backup_with_ttl(self, timestamp):  # TTL backup variant: store remaining_ttl, not the absolute expiry
    self._purge(timestamp)  # drop expired items first so no remaining_ttl is zero or negative
    snap = {}
    for item_id, info in self.items.items():  # walk every live item
        remaining_ttl = None if info["expiry"] is None else info["expiry"] - timestamp  # convert to remaining lifespan
        snap[item_id] = {
            "field1": info["field1"],
            "field2": info["field2"],
            "remaining_ttl": remaining_ttl,
            "history": list(info["history"]),
        }
    self.backups.append((timestamp, snap))  # same (timestamp, snapshot) shape as the plain backup
    return str(len(snap))
def get_history(self, timestamp, item_id): # history list variant:直接交整條歷史帶
if item_id not in self.items: # 冇卡就冇歷史
return None
return list(self.items[item_id]["history"]) # copy 一份出去
snapshot = copy.deepcopy(self.items) # Step 1: without TTL, backup is a plain deepcopy
remaining_ttl = expiry - timestamp # Step 2: with TTL, convert to remaining_ttl
best = latest_snapshot_before(target_ts) # Step 3: restore always picks the latest snapshot at or before target_ts
for ts, val in reversed(history): ... # Step 4: value-at-time uses a reversed loop
self.merged_items[source_id] = target_id # Step 5: merge records the mapping, then deletes the source
def __init__(self):
self.items = {}
self.backups = []
self.merged_items = {}
self.items = {
"item1": {
"field1": 700,
"field2": 20,
"expiry": None,
"history": [(1000, 500), (3000, 650), (5000, 700)],
},
}
self.backups = [
(3000, {
"item1": {"field1": 650, "field2": 20, "expiry": None, "history": [(1000, 500), (3000, 650)]},
"item2": {"field1": 200, "field2": 0, "expiry": None, "history": [(2000, 200)]},
}),
(5000, {
"item1": {"field1": 700, "field2": 20, "expiry": None, "history": [(1000, 500), (3000, 650), (5000, 700)]},
}),
]
self.merged_items = {
"item2": "item1",
}
Backup uses:
self.backups = [(timestamp, snapshot), ...]
Looking up old values in history uses:
self.items[item_id]["history"]
Merge uses:
self.merged_items[old_id] = new_id
So L4 state is not just self.items;
read items + backups + merged_items together.
backup(timestamp)
restore(timestamp, target_ts)
backup_with_ttl(timestamp) # stores remaining_ttl
restore_with_ttl(timestamp, target_ts)
get_value_at(timestamp, item_id, time_at)
get_history(timestamp, item_id)
merge(timestamp, id1, id2)
# Common names in real problems:
season_snapshot / season_restore
get_balance_at / get_score_at
merge_accounts / merge_profiles
Tell these apart:
- backup without TTL = deepcopy
- backup with TTL = remaining_ttl
- merge = add together / move across / del source
What is being tested:
- How the snapshot should be stored
- Whether history is value-at-time or an event list
- What happens to the source after a merge
Step 1: Check for TTL first; with TTL, think remaining_ttl
Step 2: Backups are always stored as (timestamp, snapshot)
Step 3: restore finds the latest snapshot at or before target_ts
Step 4: History queries use a reversed loop
Step 5: For merge, remember the trio: change the target + record the mapping + del the source
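The signature list names restore_with_ttl but the notes only show the backup half, so here is one way the full remaining_ttl round trip might look. It is a sketch under stated assumptions: the class `SnapshotStore` and the single `value` field are hypothetical, and backups are assumed to be appended in increasing timestamp order, so the last qualifying entry is the latest one.

```python
import copy


class SnapshotStore:
    """Hypothetical flat store used only to illustrate the remaining_ttl round trip."""

    def __init__(self):
        self.items = {}    # item_id -> {"value": ..., "expiry": int or None}
        self.backups = []  # list of (backup_ts, snapshot), appended in time order

    def backup_with_ttl(self, timestamp):
        snap = {}
        for item_id, info in self.items.items():
            expiry = info["expiry"]
            if expiry is not None and timestamp >= expiry:
                continue  # already dead at backup time, so leave it out
            remaining = None if expiry is None else expiry - timestamp
            snap[item_id] = {"value": copy.deepcopy(info["value"]), "remaining_ttl": remaining}
        self.backups.append((timestamp, snap))

    def restore_with_ttl(self, timestamp, target_ts):
        best = None
        for backup_ts, snap in self.backups:
            if backup_ts <= target_ts:
                best = snap  # later matches overwrite earlier ones
        if best is None:
            return False
        self.items = {}
        for item_id, info in best.items():
            remaining = info["remaining_ttl"]
            # The countdown restarts from the restore timestamp.
            expiry = None if remaining is None else timestamp + remaining
            self.items[item_id] = {"value": copy.deepcopy(info["value"]), "expiry": expiry}
        return True
```

Storing the remaining lifespan rather than the absolute expiry is what lets an item backed up with 4000 ms left still have 4000 ms left after a restore at any later time.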
async def batch(self, timestamp, operations):  # run a batch of ops concurrently while keeping each item consistent
    self._purge(timestamp)  # clear expired items before the batch starts
    async def execute_op(op):  # inner worker, called once per op
        if op["type"] == "transfer":  # two-key op: touches both source and target
            if op["source_id"] == op["target_id"]:  # asyncio.Lock is not reentrant; locking one key twice would hang
                return None  # treated as invalid here; check the spec for the expected result
            keys = sorted([op["source_id"], op["target_id"]])  # one global lock order avoids deadlock
            async with self.locks[keys[0]]:  # lock the lexicographically smaller key first
                async with self.locks[keys[1]]:  # then the second; both are held before moving money
                    s = self.items.get(op["source_id"])  # source item
                    t = self.items.get(op["target_id"])  # target item
                    if s is None or t is None:  # either item missing: the transfer is void
                        return None
                    amount = op["amount"]  # amount to move
                    s["field1"] -= amount  # debit the source
                    t["field1"] += amount  # credit the target
                    return s["field1"]  # the generic version returns the source's new value
        key = op["item_id"]  # most other ops touch a single item
        async with self.locks[key]:  # hold that item's lock so two coroutines never edit it at once
            if op["type"] == "create":
                return self.create(timestamp, op["item_id"], op["field1"], op.get("field2", 0))
            if op["type"] == "update":
                return self.update(timestamp, op["item_id"], op["field1"])
            if op["type"] == "delete":
                return self.delete(timestamp, op["item_id"])
            return None  # unknown op type: return an empty result
    tasks = [execute_op(op) for op in operations]  # one coroutine per op
    results = await asyncio.gather(*tasks)  # wait for every op to finish
    return list(results)  # gather already returns a list in input order; list() just makes a fresh copy
async def batch_get(self, timestamp, item_ids): # read-only batch variant:每張卡各自上鎖再 get
async def do_one(item_id):
async with self.locks[item_id]: # 即使係 read,有啲題都照跟 item lock
return self.get(timestamp, item_id)
return list(await asyncio.gather(*(do_one(item_id) for item_id in item_ids)))
async def batch_update(self, timestamp, ops): # update-only batch variant:全部都係單 key
async def do_one(op):
async with self.locks[op["item_id"]]: # 一張單一把 lock
return self.update(timestamp, op["item_id"], op["field1"])
return list(await asyncio.gather(*(do_one(op) for op in ops)))
async def execute_op(op): ... # Step 1: write the inner worker first
scope = one_item_or_two_items(op) # Step 2: decide the shared-state scope first
lock = self.locks[item_id] # Step 3: Pattern A uses a single lock
keys = sorted([source_id, target_id]) # Step 4: Pattern B sorts first, then pair-locks
results = await asyncio.gather(*tasks) # Step 5: gather last; results keep input order
def __init__(self):
self.items = {}
self.backups = []
self.merged_items = {}
self.locks = defaultdict(asyncio.Lock)
self.items = {
"alice": {"field1": 500, "field2": 20, "expiry": None, "history": [(1000, 500)]},
"bob": {"field1": 200, "field2": 0, "expiry": None, "history": [(2000, 200)]},
}
self.locks["alice"] = <Lock>
self.locks["bob"] = <Lock>
The data itself:
self.items
Concurrency control:
self.locks[item_id]
Pattern A: single key
lock = self.locks[item_id]
Pattern B: pair-lock
keys = sorted([source_id, target_id])
batch(timestamp, operations)
process_batch(timestamp, operations)
batch_update(timestamp, ops)
batch_delete(timestamp, ops)
batch_get(timestamp, item_ids)
# Common op types:
create / update / delete / get
deposit / withdraw / pay
transfer(source_id, target_id, amount)
Pattern A:
single-item op -> lock[item_id]
Pattern B:
changes two items at once -> sorted pair-lock
What is being tested:
- Whether the op changes one item or two
- Whether the lock should follow item_id, source/target, or some other key
- How to wrap the old sync methods into async
Step 1: Write async def execute_op(op) first
Step 2: Count how many items the op touches
Step 3: One key takes a single lock; two keys are sorted first, then pair-locked
Step 4: Inside the lock, call the old method / old logic directly
Step 5: Finally gather all ops, keeping the original result order
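Pattern B is easiest to trust after seeing two opposite transfers run at once. The sketch below is illustrative only: `Ledger` and its plain integer balances are assumptions, and rejecting a self-transfer is one possible policy, chosen here because `asyncio.Lock` is not reentrant and taking the same lock twice in one coroutine would never return.

```python
import asyncio
from collections import defaultdict


class Ledger:
    """Hypothetical two-account ledger; only the locking pattern matters here."""

    def __init__(self, balances):
        self.items = dict(balances)
        self.locks = defaultdict(asyncio.Lock)  # one lock per item_id, created on first use

    async def transfer(self, source_id, target_id, amount):
        if source_id == target_id:
            return False  # never lock one key twice
        first, second = sorted([source_id, target_id])  # same order for every caller
        async with self.locks[first]:
            async with self.locks[second]:
                if source_id not in self.items or target_id not in self.items:
                    return False
                await asyncio.sleep(0)  # yield on purpose so other transfers can interleave
                self.items[source_id] -= amount
                self.items[target_id] += amount
                return True


async def demo():
    ledger = Ledger({"alice": 500, "bob": 200})
    # Opposite directions at once: without a shared lock order this pair could deadlock.
    results = await asyncio.gather(
        ledger.transfer("alice", "bob", 100),
        ledger.transfer("bob", "alice", 30),
        ledger.transfer("alice", "alice", 5),
    )
    return results, ledger.items
```

Running `asyncio.run(demo())` completes instead of hanging because both opposing transfers try to take the "alice" lock first, so neither can hold one lock while waiting for the other.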
async def sync(self, timestamp, item_ids, max_concurrent): # 一次過將好多 item 推去外面同步;但同一時間只畀 N 個出門
self._purge(timestamp) # 先清走死卡;唔好浪費外部 quota 喺屍體上面
sem = asyncio.Semaphore(max_concurrent) # sem = 閘機;同一時間最多放 max_concurrent 個入去
async def do_one(item_id): # 每個 item 各自排隊過閘
if item_id not in self.items: # 呢張卡根本唔存在
return False # fail-fast:未去到閘口就即刻遣返,連 sleep 都慳返
async with sem: # 真正過關先佔用一個外部名額
await asyncio.sleep(0.01) # 模擬打 API / 寫 disk / 出網路;即係真外部成本
return True # 過關完成
tasks = [do_one(item_id) for item_id in item_ids] # 每個 item 一條 async 線;全部先排好
results = await asyncio.gather(*tasks) # 一齊等佢哋返嚟
return list(results) # 逐個 item 對應 success / fail 結果
async def process_external_transfers(self, timestamp, transfers, max_concurrent):  # advanced L6 variant: change local state first, then sleep outside
    sem = asyncio.Semaphore(max_concurrent)  # gate for the external quota
    async def do_one(transfer):
        source_id = transfer["source_id"]  # source item
        target_id = transfer["target_id"]  # target item
        if source_id == target_id:  # asyncio.Lock is not reentrant; taking the same lock twice would hang
            return False
        keys = sorted([source_id, target_id])  # fix the pair-lock order
        async with self.locks[keys[0]]:  # first lock
            async with self.locks[keys[1]]:  # second lock
                if source_id not in self.items or target_id not in self.items:  # local fail-fast
                    return False
                self.items[source_id]["field1"] -= transfer["amount"]  # debit the source
                self.items[target_id]["field1"] += transfer["amount"]  # credit the target
        async with sem:  # go external only after the local change has succeeded
            await asyncio.sleep(0.01)  # simulate the real external transfer call
        return True
    return list(await asyncio.gather(*(do_one(t) for t in transfers)))
sem = asyncio.Semaphore(max_concurrent) # Step 1: create sem inside the function
if invalid: return False # Step 2: fail-fast always happens before sem
if need_local_mutation: mutate_under_lock_first # Step 3: some problems change local state first
async with sem: await asyncio.sleep(0.01) # Step 4: do the external work once through the gate
return list(await asyncio.gather(*tasks)) # Step 5: gather the batch results last
def __init__(self):
self.items = {}
self.backups = []
self.merged_items = {}
self.locks = defaultdict(asyncio.Lock)
self.items = {
"alice": {"field1": 500, "field2": 20, "expiry": None, "history": [(1000, 500)]},
"bob": {"field1": 200, "field2": 0, "expiry": None, "history": [(2000, 200)]},
}
# Note:
# sem is not __init__ state
# sem = asyncio.Semaphore(max_concurrent) is created inside the function, per call
Persistent state:
- self.items
- self.locks
Temporary runtime state:
- sem
fail-fast:
- item does not exist → return False straight away
- do not enter sem
- do not sleep
sync(timestamp, item_ids, max_concurrent)
process_external(timestamp, items, max_concurrent)
sync_scores(timestamp, player_ids, max_concurrent)
process_external_transfers(timestamp, transfers, max_concurrent)
L6 has two common flavours:
1. fail-fast: leave as soon as a check fails
2. fail-fast + local lock + external sem: change local state first, then sleep outside
If the spec says every request must be attempted:
it can also become all-sleep, though that is less common in Family 1
What is being tested:
- Where the sem is created
- Whether fail-fast happens before the sem
- Whether local mutation and the external call need to be separated
Step 1: Create sem = asyncio.Semaphore(max_concurrent) inside the function
Step 2: Write the do_one(...) inner async worker
Step 3: Do the existence / status / balance fail-fast checks first
Step 4: If local state must change, lock and mutate first, then sleep outside
Step 5: Finally gather all results, one output per input in the same order
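To see that the semaphore really caps concurrency and that fail-fast items never reach it, the following sketch tracks the peak number of simultaneous simulated calls. The function `sync_items`, its `delay` parameter, and the returned peak counter are all assumptions added for illustration; a real spec would only ask for the result list.

```python
import asyncio


async def sync_items(items, item_ids, max_concurrent, delay=0.01):
    """Hypothetical fail-fast sync; returns (results, peak concurrent external calls)."""
    sem = asyncio.Semaphore(max_concurrent)  # created per call, not stored on self
    in_flight = 0
    peak = 0

    async def do_one(item_id):
        nonlocal in_flight, peak
        if item_id not in items:
            return False  # fail-fast: never touches the semaphore, never sleeps
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(delay)  # stand-in for the external call
            in_flight -= 1
        return True

    results = await asyncio.gather(*(do_one(i) for i in item_ids))
    return list(results), peak
```

With three valid ids and `max_concurrent=2`, the third call has to wait for a free slot, so the observed peak stays at 2 while the result list still lines up with the input order.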
This page is the GenericF2 class from the Generic Mock, split out into a mock page of its own. Family 2 = a container that holds sub-items.
Family 2 should trigger an instant reaction:
outer dict = container
inner dict = sub-item
Typical signatures:
- create_container(container_id, capacity)
- add_sub(container_id, sub_id, data)
- remove_sub(container_id, sub_id)
- find_sub(sub_id)
Applies to:
- InMemDB: a key holds fields
- DNS: a domain holds record types
- Parking: a lot holds cars
- PubSub: a topic holds subscribers / messages
self.containers = {
"container1": {
"capacity": 5,
"subs": {
"sub1": {"data": 100, "expiry": None},
"sub2": {"data": 50, "expiry": 9000},
},
"history": ["sub1", "sub2"],
},
}
import copy
import asyncio
from collections import defaultdict
class GenericF2:
    def __init__(self):
        self.containers = {}  # container_id -> {capacity, subs, history}
        self.backups = []  # L4 snapshot list
        self.merged_containers = {}  # L4 merge record (if the problem has one)
        self.locks = defaultdict(asyncio.Lock)  # L5 per-container lock
def __init__(self):
self.containers = {}
self.backups = []
self.merged_containers = {}
self.locks = defaultdict(asyncio.Lock)
self.containers = {
"lot1": {
"capacity": 2,
"subs": {
"car1": {"data": 100, "expiry": None},
"car2": {"data": 50, "expiry": 9000},
},
"history": ["car1", "car2"],
},
}
self.backups = []
self.merged_containers = {}
self.locks = defaultdict(asyncio.Lock)
L1: self.containers[container_id]["subs"][sub_id]
L3: each sub gains "expiry"
L4: self.backups + container["history"]
L5: self.locks[container_id]
L6: sem is created inside the function, per call; it does not go in __init__
def create_container(self, timestamp, container_id, capacity): # 先開個大容器;等於先起個櫃桶 / 房 / topic
if container_id in self.containers: # 呢個大容器名已經有人用緊
return False # 撞名就唔畀重開
self.containers[container_id] = {
"capacity": capacity, # 呢個大容器最多裝幾多件 sub-item
"subs": {}, # 內層真正擺貨嘅地方;之後 sub_id 全部塞喺度
"history": [], # L4 用;等於記住曾經入過邊啲貨
}
return True
def add_sub(self, timestamp, container_id, sub_id, data): # 將一件細 item 放入指定容器
if container_id not in self.containers: # 連大容器都未起好
return False
container = self.containers[container_id] # 先攞出目標容器;之後所有判斷都圍住佢做
for cid, c in self.containers.items(): # 行晒全世界所有容器
if sub_id in c["subs"]: # 如果其他地方已經收咗同名 sub
return False # 全局撞名;唔畀一件貨同時喺兩個櫃桶出現
if len(container["subs"]) >= container["capacity"]: # 呢個容器已經裝滿
return False # 再塞落去就爆倉
container["subs"][sub_id] = {"data": data, "expiry": None} # 正式將貨放入內層小格;L1 先當永不過期
container["history"].append(sub_id) # 記低「呢件貨曾經入過呢個容器」
return True
def remove_sub(self, timestamp, container_id, sub_id): # 將指定細 item 由容器入面拎走
if container_id not in self.containers: # 連個櫃桶都冇
return False
if sub_id not in self.containers[container_id]["subs"]: # 目標櫃桶有,但入面搵唔到呢件貨
return False
del self.containers[container_id]["subs"][sub_id] # 真正將內層細格拆走
return True
def get_count(self, timestamp, container_id): # 查呢個容器而家裝住幾多件貨
if container_id not in self.containers: # 大容器唔存在
return None
return len(self.containers[container_id]["subs"]) # 直接數 inner dict 有幾個 key
def get_sub(self, timestamp, container_id, sub_id): # CRUD variant:直接睇某件 sub-item
if container_id not in self.containers: # outer 都未有
return None
info = self.containers[container_id]["subs"].get(sub_id) # inner lookup
return None if info is None else dict(info) # copy 一份返出去
def set_implicit(self, timestamp, container_id, sub_id, data): # hidden-create variant:冇 create_container,直接自動開 outer
if container_id not in self.containers: # outer 唔存在就即場補開
self.containers[container_id] = {"capacity": 999999, "subs": {}, "history": []}
if sub_id in self.containers[container_id]["subs"]: # 同一個 outer 入面撞名
return False
self.containers[container_id]["subs"][sub_id] = {"data": data, "expiry": None} # 直接塞入 inner dict
self.containers[container_id]["history"].append(sub_id) # 歷史都要記
return True
container = self.containers.get(container_id) # Step 1: find the outer container first
subs = None if container is None else container["subs"] # Step 2: then drop down to the inner subs
if need_implicit_create and container is None: open_outer() # Step 3: for hidden-create problems, open the outer first
if duplicate_or_full: return False # Step 4: duplicate / capacity / missing checks
subs[sub_id] = {"data": data} # Step 5: the actual CRUD always hits the inner dict
def __init__(self):
self.containers = {}
self.containers = {
"lot1": {
"capacity": 2,
"subs": {
"car1": {"data": 100},
},
},
"lot2": {
"capacity": 3,
"subs": {},
},
}
Only:
self.containers[container_id]["subs"][sub_id]
Not yet used at L1:
- self.backups
- self.merged_containers
- self.locks
# Explicit container version
create_container(timestamp, container_id, capacity)
add_sub(timestamp, container_id, sub_id, data)
remove_sub(timestamp, container_id, sub_id)
get_count(timestamp, container_id)
get_sub(timestamp, container_id, sub_id)
# Hidden-create version (the outer dict is opened automatically)
set(key, field, value)
add_record(domain, record_type, value)
park_car(lot_id, car_id)
# Common renames within the same family:
create_channel / inject_signal / drop_signal / read_channel
reserve / release / get_remaining
add_ingredient / remove_ingredient / get_recipe
# Some problems already have a two-container op at L1:
move_sub(from_container, to_container, sub_id)
What is being tested:
- Which is opened first, the outer dict or the inner dict
- Whether sub_id must be globally unique
- How to write the capacity / duplicate / missing checks
Step 1: Identify which two levels the outer key and the inner key are
Step 2: Decide between an explicit create_container and implicitly opening the outer dict
Step 3: Before add_sub, check the outer exists, the sub_id is not a duplicate, and capacity is not full
Step 4: remove / get / count all go through self.containers[container_id]["subs"]
Step 5: If the problem has move / copy from the start, remember it is still this family
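For the hidden-create version, the outer dict can be opened on first write with `dict.setdefault`. The class below is a hypothetical InMemDB-style sketch, not taken from any specific problem; whether an emptied outer key should be dropped afterwards depends on the spec, and this sketch keeps it.

```python
class NestedStore:
    """Hypothetical two-level store: outer key -> inner field -> value."""

    def __init__(self):
        self.containers = {}

    def set(self, key, field, value):
        # Implicit create: the outer dict is opened on first write.
        self.containers.setdefault(key, {})[field] = value

    def get(self, key, field):
        # A missing outer key and a missing inner field both yield None.
        return self.containers.get(key, {}).get(field)

    def delete(self, key, field):
        fields = self.containers.get(key)
        if fields is None or field not in fields:
            return False
        del fields[field]  # the outer key stays, even when it becomes empty
        return True
```

The read path uses `.get(key, {})` rather than `setdefault`, so a lookup never creates an outer entry as a side effect.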
def find_sub(self, timestamp, sub_id): # 得一件細貨 id,反推佢而家收喺邊個大櫃桶
for cid, container in self.containers.items(): # 逐個大容器打開睇一次
if sub_id in container["subs"]: # 呢件貨喺呢個容器入面
return cid # 即刻回報地址;唔使再巡其他倉
return None # 行晒全場都冇見過
def top_n(self, timestamp, n): # 排頭 n 個最滿 / 最熱鬧嘅容器
items = [] # 臨時排行榜
for cid, container in self.containers.items(): # 巡每個容器
count = len(container["subs"]) # 用內層 sub 數量做 metric
items.append((-count, cid)) # 負號 = 想最多件排先;同數再按 id
items.sort() # count desc + cid asc
result = [] # 正式榜單
for value, cid in items[:n]: # 只拎頭 n 名
result.append(cid)
return result
def scan(self, timestamp, container_id): # scan variant:列出某個 outer 入面所有 inner key
if container_id not in self.containers: # outer 唔存在
return ""
keys = sorted(self.containers[container_id]["subs"].keys()) # inner keys 排序
return ",".join(keys) # generic 版用逗號串起
def scan_by_prefix(self, timestamp, container_id, prefix): # prefix variant:只揀某個 prefix 開頭嗰堆 sub
if container_id not in self.containers:
return ""
keys = [sub_id for sub_id in self.containers[container_id]["subs"] if sub_id.startswith(prefix)] # 先 filter
keys.sort() # 再 sort
return ",".join(keys)
if only_sub_id: for cid, container in self.containers.items(): ... # Step 1: reverse lookup loops over the outer dict
if have_container_id: read_inner_dict_directly # Step 2: scan goes straight to that outer
rows.append((-metric, cid)) # Step 3: top-N collects tuples first
rows.sort() # Step 4: sort, then cut the first n
return format_after_sort # Step 5: format the output last
def __init__(self):
self.containers = {}
self.containers = {
"lot1": {"capacity": 2, "subs": {"car1": {"data": 100}, "car2": {"data": 50}}},
"lot2": {"capacity": 3, "subs": {"car3": {"data": 70}}},
}
find_sub:
loop over all outer containers
top_n:
use len(container["subs"]) as the metric
So this level still only reads self.containers;
backup / lock are not involved yet.
find_sub(timestamp, sub_id)
top_n(timestamp, n)
scan(timestamp, container_id)
scan_by_prefix(timestamp, container_id, prefix)
list_containers(timestamp)
get_reverse_lookup(timestamp, sub_id)
# Common names in real problems:
find_car / find_signal / get_tenant_zones
scan / scan_by_prefix / list_domains
strongest_channels / get_fullest_lots
Two core flavours:
1. reverse lookup: given only sub_id, loop over the outer dict
2. count/sort/scan: collect from the inner dict, then sort
What is being tested:
- Whether the query targets an outer container or reverse-looks-up an inner sub
- How to tell the scan / prefix / top-N variants apart
- Whether the output is a list, a string, or a single id
Step 1: Check whether the parameter is container_id or sub_id
Step 2: With only sub_id, loop over the whole outer dict
Step 3: To scan one container, collect the inner keys and then sort
Step 4: For top-N, build tuples from the inner count / metric
Step 5: Format last; do not build the string early
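The collect, sort, then format order from the steps above fits in two small standalone functions. These are sketches over an assumed data shape (`containers[cid]["subs"]` as in this family); the function names mirror the notes but take the dict as a parameter so they can be run on their own.

```python
def top_n(containers, n):
    """Return the ids of the n fullest containers (hypothetical data shape)."""
    rows = [(-len(c["subs"]), cid) for cid, c in containers.items()]
    rows.sort()  # count descending via the negated key, then id ascending on ties
    return [cid for _, cid in rows[:n]]


def scan_by_prefix(containers, container_id, prefix):
    """Return 'k1,k2,...' for inner keys starting with prefix, or '' if none."""
    subs = containers.get(container_id, {}).get("subs", {})
    keys = sorted(k for k in subs if k.startswith(prefix))
    return ",".join(keys)  # format only after sorting
```

Negating the count lets a single ascending `sort()` express "largest first, ties by id ascending" without a custom key function.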
def _purge(self, timestamp): # L3 先有 helper;因為 inner sub 而家先開始有生死時間
for cid, container in self.containers.items(): # 逐個大容器巡一次
for sub_id in list(container["subs"].keys()): # 將細貨名單先抄出嚟;避免一邊行一邊刪爆 dict
exp = container["subs"][sub_id]["expiry"] # 攞呢件貨嘅 expiry
if exp is not None and timestamp >= exp: # 到鐘就代表呢件貨已經腐爛 / 過期 / 失效
del container["subs"][sub_id] # 真刪 inner entry;等於將過期貨直接由貨架撤走
def add_sub_with_ttl(self, timestamp, container_id, sub_id, data, ttl_ms): # 放貨入櫃時順手貼埋到期標籤
self._purge(timestamp) # 開工前先巡一次,清走舊垃圾
if container_id not in self.containers: # 目標容器都未起好
return False
container = self.containers[container_id]
for cid, c in self.containers.items(): # 全場再巡一次,睇下有冇同名貨已經擺喺其他櫃
if sub_id in c["subs"]:
return False
if len(container["subs"]) >= container["capacity"]: # 呢個櫃桶已經滿晒
return False
expiry = None # 預設永久貨;冇期限就一直放喺架
if ttl_ms is not None: # caller 真係有畀保鮮期
expiry = timestamp + ttl_ms # 換算成絕對死亡時間
container["subs"][sub_id] = {
"data": data, # 真正貨內容
"expiry": expiry, # 幾時過期;helper 之後靠呢格清貨
}
container["history"].append(sub_id) # 即使有 TTL,都記住呢件貨曾經來過
return True
def extend_sub(self, timestamp, container_id, sub_id, extra_ms): # TTL variant:延長 inner sub 壽命
self._purge(timestamp) # 先清走已死貨
if container_id not in self.containers: # outer 唔存在
return False
sub = self.containers[container_id]["subs"].get(sub_id) # inner lookup
if sub is None or sub["expiry"] is None: # 冇呢件貨,或者本身冇 TTL
return False
sub["expiry"] += extra_ms # 直接喺舊 expiry 上加
return True
def get_alive_sub(self, timestamp, container_id, sub_id): # inline-alive variant:讀嗰陣先 skip 死 record
if container_id not in self.containers:
return None
sub = self.containers[container_id]["subs"].get(sub_id)
if sub is None or (sub["expiry"] is not None and timestamp >= sub["expiry"]): # 即場 check 是否已死
return None
return dict(sub)
sub["expiry"] = timestamp + ttl_ms # Step 1: put expiry on the inner sub
self._purge(timestamp) # Step 2: lazy-purge problems purge at the top of every public method
if inline_alive_check: skip_dead_subs # Step 3: the other route skips dead records on the read path
if extend: sub["expiry"] += extra_ms # Step 4: extend adds to the existing expiry
outer_stays_alive = True # Step 5: the outer container stays even when every inner sub has died
def __init__(self):
self.containers = {}
self.containers = {
"lot1": {
"capacity": 2,
"subs": {
"car1": {"data": 100, "expiry": 9000},
"car2": {"data": 50, "expiry": None},
},
"history": ["car1", "car2"],
},
}
The inner sub-item gains:
"expiry"
TTL always hangs on the inner sub,
not on the outer container.
add_sub_with_ttl(timestamp, container_id, sub_id, data, ttl_ms)
set_with_ttl(key, field, value, ttl_ms)
extend(timestamp, container_id, sub_id, extra_ms)
renew(timestamp, container_id, sub_id)
# Family 2 L3 has two main routes:
1. lazy purge with real deletion:
_purge(timestamp)
if expiry reached -> del inner entry
2. inline alive check:
_is_alive(fd, timestamp)
get/scan skip dead records at read time
# Names in real problems:
park_car(..., ttl_ms)
inject_signal(..., ttl_ms)
set(..., ttl_ms)
add_record(..., ttl_ms)
What is being tested:
- TTL hangs on the inner sub-item
- Whether the problem wants an inline check or a lazy purge
- Whether expiry means delete, or only skip on read
Step 1: Put expiry in the inner dict
Step 2: Choose the helper style: _purge or _is_alive
Step 3: The write path computes expiry = timestamp + ttl_ms
Step 4: The read / scan path adds a purge or an alive check
Step 5: Keep in mind the outer container stays while inner subs die one by one
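The notes show the lazy-purge route in full but only name `_is_alive`, so here is a rough sketch of the inline route. The argument order `(sub, timestamp)` and the helper `scan_alive` are assumptions for illustration; the key property is that dead records are skipped on read and never deleted.

```python
def _is_alive(sub, timestamp):
    """True while the sub-item has no expiry or timestamp is strictly before it."""
    return sub["expiry"] is None or timestamp < sub["expiry"]


def scan_alive(containers, timestamp, container_id):
    """List live inner keys without deleting the dead ones (hypothetical helper)."""
    container = containers.get(container_id)
    if container is None:
        return []
    return sorted(k for k, sub in container["subs"].items() if _is_alive(sub, timestamp))
```

Because nothing is deleted, a query with an earlier timestamp still sees the record, which matters when a problem allows timestamps to be checked out of order.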
def backup(self, timestamp): # 幫每個大容器影相;但有 TTL,所以唔可以照抄 expiry
self._purge(timestamp) # 影相前先清走已死貨;避免相入面夾雜過期垃圾
snapshot = {}
for cid, container in self.containers.items(): # 每個大容器都各自影一格
snapshot[cid] = {
"capacity": container["capacity"], # 容器本身規格照抄
"history": list(container["history"]), # 歷史用新 list 抄一份;避免共用 reference
"subs": {}, # 內層貨之後逐件塞入 snapshot
}
for sub_id, info in container["subs"].items(): # 巡每件仍然活緊嘅貨
remaining = None # 預設永久貨冇倒數
if info["expiry"] is not None:
remaining = info["expiry"] - timestamp # 唔存絕對 expiry,只存當下仲有幾耐命
snapshot[cid]["subs"][sub_id] = {
"data": info["data"],
"remaining_ttl": remaining, # restore 嗰陣靠呢格重新計時
}
self.backups.append((timestamp, snapshot)) # 將影相時間同相片一齊收起
def restore(self, timestamp, target_ts): # 想返到 target_ts 當時個貨架狀態
best = None
for backup_ts, snapshot in self.backups: # 行晒所有舊相
if backup_ts <= target_ts: # 只揀 target_ts 或之前嘅相
best = (backup_ts, snapshot) # 一路覆蓋到最後,即最近嗰張合資格相
if best is None:
return False
self.containers = {} # 先清空現場;等陣成份場景重建
for cid, snap in best[1].items():
self.containers[cid] = {"capacity": snap["capacity"], "history": list(snap["history"]), "subs": {}}
for sub_id, info in snap["subs"].items(): # 將每件舊貨重新擺返入架
new_expiry = None
if info["remaining_ttl"] is not None:
new_expiry = timestamp + info["remaining_ttl"] # 用「而家」加返剩餘壽命,重新開始倒數
self.containers[cid]["subs"][sub_id] = {"data": info["data"], "expiry": new_expiry}
return True
def get_history(self, timestamp, container_id): # 查呢個容器曾經裝過邊啲貨
self._purge(timestamp) # 查之前先保持現場乾淨
if container_id not in self.containers:
return None
return list(self.containers[container_id]["history"]) # 回新 list;唔好直接交內部 reference 畀外面亂改
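def move_sub(self, timestamp, from_container, to_container, sub_id):  # pair-op variant: move an inner item to another outer container
    self._purge(timestamp)  # clear expired subs first
    source = self.containers.get(from_container)  # source outer
    target = self.containers.get(to_container)  # target outer
    if source is None or target is None or sub_id not in source["subs"]:  # a missing outer, or the source does not hold the sub
        return False
    if from_container == to_container:  # moving onto itself would delete the sub in the steps below
        return False
    if len(target["subs"]) >= target["capacity"]:  # target is full
        return False
    target["subs"][sub_id] = source["subs"][sub_id]  # carry the whole record across
    del source["subs"][sub_id]  # remove it from the source
    target["history"].append(sub_id)  # target history records the newcomer
    return True
def copy_sub(self, timestamp, source_container, target_container, sub_id):  # copy variant: source keeps its sub, target gets a new one
    self._purge(timestamp)  # same lazy purge as every other public method
    source = self.containers.get(source_container)
    target = self.containers.get(target_container)
    if source is None or target is None or sub_id not in source["subs"]:
        return False
    if sub_id not in target["subs"] and len(target["subs"]) >= target["capacity"]:  # a new entry must still fit
        return False
    target["subs"][sub_id] = copy.deepcopy(source["subs"][sub_id])  # deep copy so the two records share no state
    target["history"].append(sub_id)
    return True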
snapshot[cid]["subs"][sub_id]["remaining_ttl"] = remaining # Step 1: backup converts TTL to remaining_ttl
new_expiry = timestamp + remaining_ttl # Step 2: restore recomputes expiry from the restore time
history = self.containers[cid]["history"] # Step 3: history follows the outer container
source_pop_then_target_set = move_sub() # Step 4: move = target receives, source is cleared
source_keep_then_target_clone = copy_sub() # Step 5: copy = source kept, target gets a deep copy
def __init__(self):
self.containers = {}
self.backups = []
self.merged_containers = {}
self.containers = {
"lot1": {
"capacity": 2,
"subs": {
"car1": {"data": 100, "expiry": 9000},
},
"history": ["car1", "car2"],
},
}
self.backups = [
(5000, {
"lot1": {
"capacity": 2,
"history": ["car1", "car2"],
"subs": {
"car1": {"data": 100, "remaining_ttl": 4000},
},
},
}),
]
self.merged_containers = {
"old_lot": "lot1",
}
backup / restore use:
self.backups
history uses:
self.containers[container_id]["history"]
If the problem has merge_container:
self.merged_containers[from_id] = to_id
This level is more than outer + inner dict;
read containers + backups + history + merged_containers together.
backup(timestamp)
restore(timestamp, target_ts)
get_history(timestamp, container_id)
move_sub(timestamp, from_container, to_container, sub_id)
upgrade_room(timestamp, from_room_id, to_room_id)
copy_file(timestamp, source, dest)
merge_container(timestamp, from_id, to_id) # rare, but it does appear
Tell these apart:
- backup/restore = the remaining_ttl kind
- history = container history / inner event list
- move/upgrade = move the data + clear the source
- copy = source unchanged, dest overwritten or created
What is being tested:
- How a nested snapshot should be stored
- remaining_ttl is computed at the inner-item level
- Which two outer keys move / copy / upgrade actually change
Step 1: Snapshot the outer container shell first, then store remaining_ttl per inner sub
Step 2: On restore, rebuild the outer first, then put the inner subs back one by one
Step 3: For history, check whether it records container history or sub events
Step 4: move / upgrade = source cleared; copy = source kept
Step 5: If there is a merge_container, remember it is an outer-level relocation
async def batch(self, timestamp, operations):  # process a batch of container ops in one go
    self._purge(timestamp)  # clear expired subs before the batch
    async def execute_op(op):  # one async worker per op
        if op["type"] == "move":  # touches two outer containers
            if op["from_container"] == op["to_container"]:  # asyncio.Lock is not reentrant; locking one key twice would hang
                return False
            keys = sorted([op["from_container"], op["to_container"]])  # one shared lock order so two ops cannot block each other
            async with self.locks[keys[0]]:  # lock the smaller id first
                async with self.locks[keys[1]]:  # then the larger; both are held before moving
                    from_c = self.containers.get(op["from_container"])  # source container
                    to_c = self.containers.get(op["to_container"])  # target container
                    if from_c is None or to_c is None:  # either container missing
                        return False
                    sub_id = op["sub_id"]
                    if sub_id not in from_c["subs"]:  # the source does not hold this sub
                        return False
                    if len(to_c["subs"]) >= to_c["capacity"]:  # target is full
                        return False
                    to_c["subs"][sub_id] = from_c["subs"][sub_id]  # carry the whole record across
                    del from_c["subs"][sub_id]  # remove it from the source so it never lives in two places
                    return True
        cid = op["container_id"]  # every other op touches a single outer key
        async with self.locks[cid]:  # hold that container's lock while adding or removing
            if op["type"] == "add":
                return self.add_sub(timestamp, cid, op["sub_id"], op["data"])
            if op["type"] == "remove":
                return self.remove_sub(timestamp, cid, op["sub_id"])
            return None
    tasks = [execute_op(op) for op in operations]  # one coroutine per op
    results = await asyncio.gather(*tasks)  # wait for every move / add / remove to finish
    return list(results)
async def batch_scan(self, timestamp, container_ids): # read-only batch variant:逐個 outer scan
async def do_one(container_id):
async with self.locks[container_id]: # read 題有時一樣跟 outer lock
return self.get_count(timestamp, container_id)
return list(await asyncio.gather(*(do_one(cid) for cid in container_ids)))
async def batch_resolve(self, timestamp, requests): # update-only variant:全部都係單 container op
async def do_one(req):
async with self.locks[req["container_id"]]: # 一條 request 一把 outer lock
return self.remove_sub(timestamp, req["container_id"], req["sub_id"])
return list(await asyncio.gather(*(do_one(req) for req in requests)))
scope = one_container_or_two_containers(op) # Step 1: decide the shared outer scope first
lock = self.locks[container_id] # Step 2: Pattern A locks on the outer key
keys = sorted([from_container, to_container]) # Step 3: Pattern B always uses a sorted pair-lock
call_old_sync_logic_inside_lock() # Step 4: call add/remove/move inside the lock
return list(await asyncio.gather(*tasks)) # Step 5: gather the batch results last
def __init__(self):
self.containers = {}
self.backups = []
self.merged_containers = {}
self.locks = defaultdict(asyncio.Lock)
self.containers = {
"lot1": {"capacity": 2, "subs": {"car1": {"data": 100, "expiry": None}}, "history": ["car1"]},
"lot2": {"capacity": 2, "subs": {}, "history": []},
}
self.locks["lot1"] = <Lock>
self.locks["lot2"] = <Lock>
The data itself:
self.containers
Lock state:
self.locks[container_id]
Pattern A:
an op changes only one container
Pattern B:
move / copy / upgrade change two outer keys at once
→ sorted pair-lock
batch(timestamp, operations)
batch_operations(timestamp, ops)
batch_resolve(timestamp, requests)
batch_ops(timestamp, ops)
# Single-container ops
add / remove / update / resolve / set / delete
# Two-container / two-key ops
move(from_container, to_container, sub_id)
upgrade(from_room_id, to_room_id)
copy(source, dest)
Pattern A:
changes one outer item -> lock[container_id]
Pattern B:
changes two outer items at once -> sorted pair-lock
What is being tested:
- The lock follows the outer container, not the inner sub
- When move / copy / upgrade need a pair-lock
- How to wrap the old sync CRUD into an async batch
Step 1: Write execute_op(op) first
Step 2: Check whether the op changes one container or two
Step 3: One container: lock[container_id]
Step 4: Two containers: sorted([a, b]), then pair-lock
Step 5: Call the old add/remove/move logic inside the lock, then gather
async def sync(self, timestamp, container_ids, max_concurrent): # 將多個大容器逐個同步去外部世界
self._purge(timestamp) # 出發前先清場;唔同步死貨
sem = asyncio.Semaphore(max_concurrent) # 閘機:同一時間最多放 N 個容器出去
async def do_one(cid): # 每個容器一條排隊線
if cid not in self.containers: # 呢個容器名根本唔存在
return False # fail-fast:門口即彈走
async with sem: # 真正輪到佢先佔用外部資源
await asyncio.sleep(0.01) # 模擬出網 / 出 disk / 出 replication call
return True
tasks = [do_one(cid) for cid in container_ids] # 所有容器一齊排隊
results = await asyncio.gather(*tasks) # 等全部同步結果返嚟
return list(results)
async def batch_scan_all_sleep(self, timestamp, container_ids, max_concurrent): # all-sleep variant:無論存在與否都照樣 attempt
sem = asyncio.Semaphore(max_concurrent) # 閘口一樣限流
async def do_one(container_id):
async with sem: # 先入 sem
await asyncio.sleep(0.01) # 先 sleep,模擬一定要打外部 call
if container_id not in self.containers: # sleep 完先知 / 先回失敗
return False
return True
return list(await asyncio.gather(*(do_one(cid) for cid in container_ids)))
async def propagate(self, timestamp, domain_ids, max_concurrent): # fail-fast variant:domain 唔存在就即走
sem = asyncio.Semaphore(max_concurrent)
async def do_one(domain_id):
if domain_id not in self.containers: # generic 版用 containers 代 domain registry
return False
async with sem:
await asyncio.sleep(0.01)
return True
return list(await asyncio.gather(*(do_one(did) for did in domain_ids)))
is_all_sleep = read_spec_first() # Step 1: decide fail-fast or all-sleep first
sem = asyncio.Semaphore(max_concurrent) # Step 2: create sem inside the function
if fail_fast and invalid: return False # Step 3: the fail-fast version checks first
async with sem: await asyncio.sleep(0.01) # Step 4: the real external cost sits inside sem
return gathered_results # Step 5: one output per input, in order
def __init__(self):
self.containers = {}
self.backups = []
self.merged_containers = {}
self.locks = defaultdict(asyncio.Lock)
self.containers = {
"lot1": {"capacity": 2, "subs": {"car1": {"data": 100, "expiry": None}}, "history": ["car1"]},
"lot2": {"capacity": 3, "subs": {}, "history": []},
}
# sem does not go in __init__
# sem = asyncio.Semaphore(max_concurrent) is created inside sync()/batch_scan(), per call
Persistent state:
- self.containers
- self.locks (if the problem needs it)
Temporary runtime state:
- sem
fail-fast version:
an invalid container returns False straight away
all-sleep version:
an invalid container still enters sem and sleeps before returning
sync(timestamp, container_ids, max_concurrent)
propagate(timestamp, domain_ids, max_concurrent)
sync_files(timestamp, paths, max_concurrent)
batch_scan(keys, max_concurrent)
Family 2 L6 has two most common flavours:
1. fail-fast
key/path/domain does not exist -> return straight away
2. all-sleep
everything enters sem
check / return only after the sleep
Names in real problems:
batch_scan / sync_lots / sync_channels / propagate / sync_files
What is being tested:
- Whether this one is fail-fast or all-sleep
- Whether an invalid key may be rejected before the sem
- Whether the final result is a list, dict, or tuple
Step 1: Read the spec and decide whether an invalid key must still sleep
Step 2: Create sem inside the function
Step 3: Write do_one(...); fail-fast checks first, all-sleep enters sem first
Step 4: After the sleep, return True/False or (key, value)
Step 5: After gather, shape the results into a list or dict as the spec requires
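The two flavours return the same result list and differ only in how many external attempts are made, which the sketch below makes visible by counting simulated calls. `run_batch`, its `all_sleep` flag, and the call counter are hypothetical names introduced for this comparison; a real problem would pick one flavour based on the spec wording.

```python
import asyncio


async def run_batch(known, keys, max_concurrent, all_sleep):
    """Hypothetical harness: returns (results, number of simulated external calls)."""
    sem = asyncio.Semaphore(max_concurrent)
    calls = 0

    async def do_one(key):
        nonlocal calls
        if not all_sleep and key not in known:
            return False  # fail-fast: rejected before the semaphore
        async with sem:
            calls += 1
            await asyncio.sleep(0.001)  # stand-in for the external call
        return key in known  # all-sleep: the check happens after the attempt

    results = await asyncio.gather(*(do_one(k) for k in keys))
    return list(results), calls
```

For three keys with one unknown, the fail-fast run makes 2 simulated calls and the all-sleep run makes 3, so a test that measures elapsed time or call counts can tell the two apart even though the returned lists match.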
This page is the GenericF3 class from the Generic Mock, split out into a mock page of its own. Family 3 = two flat dicts holding two different kinds of things.
Family 3 core:
self.groups = {} # type-A things / config
self.items = {} # type-B things / instances
Typical signatures:
- create_group(group_id, config)
- create_item(group_id, item_id, other_id, data)
- clear_item(item_id)
- get_active_count(other_id)
Applies to:
- Workflow: workflow + step status
- PackageMgr: package + installation record
- OrderBook: book side + order
self.groups = {
"g1": {"config": "gold"},
}
self.items = {
"i1": {
"group_id": "g1",
"other_id": "user1",
"data": 40,
"status": "ACTIVE",
"expiry": None,
},
}
import copy
import asyncio
from collections import defaultdict
class GenericF3:
    def __init__(self):
        self.groups = {}  # group_id -> config / capacity / meta
        self.items = {}  # item_id -> {group_id, other_id, data, status, expiry}
        self.backups = []  # L4 snapshots
        self.merged_groups = {}  # L4 old_group -> new_group
        self.history = defaultdict(list)  # L4 history per group
        self.locks = defaultdict(asyncio.Lock)  # L5 per-item / per-other lock
def __init__(self):
self.groups = {}
self.items = {}
self.backups = []
self.merged_groups = {}
self.history = defaultdict(list)
self.locks = defaultdict(asyncio.Lock)
self.groups = {
"g1": {"config": "gold"},
}
self.items = {
"i1": {"group_id": "g1", "other_id": "user1", "data": 40, "status": "ACTIVE", "expiry": None},
}
self.backups = []
self.merged_groups = {}
self.history = {"g1": ["i1"]}
self.locks = defaultdict(asyncio.Lock)
L1: self.groups + self.items
L3: items gain "expiry" and status transitions
L4: self.backups + self.history + self.merged_groups
L5: self.locks
L6: sem is still created locally inside the function
def create_group(self, timestamp, group_id, config):  # open a "category / policy / workflow container" first
    if group_id in self.groups:  # a group with this id already exists
        return False
    self.groups[group_id] = {"config": config}  # a group usually stores only static data: spec, limits, settings
    return True
def create_item(self, timestamp, group_id, item_id, other_id, data):  # then open a real instance under that group
    if group_id not in self.groups:  # parent group missing, i.e. no policy / workflow / band
        return False
    if item_id in self.items:  # this item id is already taken
        return False
    self.items[item_id] = {
        "group_id": group_id,  # which group owns this item; merge / history rely on it later
        "other_id": other_id,  # the second relationship; may be entity_id / operator_id / user_id
        "data": data,  # the real payload; may be severity / amount / step data
        "status": "ACTIVE",  # born active; may later become CLEARED / EXPIRED / REPORTED
        "expiry": None,  # no TTL at L1
    }
    self.history[group_id].append(item_id)  # record that this group once received this item
    return True
def clear_item(self, timestamp, item_id):  # mark an item as cleared
    if item_id not in self.items:  # no such item
        return False
    self.items[item_id]["status"] = "CLEARED"  # not deleted, because history / report / audit may still need it
    return True
def get_active_count(self, timestamp, other_id):  # count the ACTIVE items that belong to one other_id
    count = 0
    for item in self.items.values():  # full scan: items is flat, not bucketed by other_id
        if item["other_id"] == other_id and item["status"] == "ACTIVE":
            count += 1  # counted only if it belongs to this owner and is still active
    return count
def get_item(self, timestamp, item_id):  # item-only variant: read one B record directly
    item = self.items.get(item_id)  # flat item lookup
    return None if item is None else dict(item)  # return a shallow copy
def delete_group(self, timestamp, group_id):  # group variant: check for items underneath before deleting
    if group_id not in self.groups:  # the A dict has no such group
        return False
    for item in self.items.values():  # scan every item
        if item["group_id"] == group_id and item["status"] == "ACTIVE":  # a live item still hangs off this group
            return False
    del self.groups[group_id]  # actually delete the group
    return True
group = self.groups.get(group_id) # Step 1: separate A = groups from B = items
item = self.items.get(item_id) # Step 2: item-only methods go straight to self.items[item_id]
self.items[item_id] = {"group_id": group_id, ...} # Step 3: create_item records both relationships
if item["status"] == "ACTIVE": count += 1 # Step 4: count problems usually filter on status
if deleting_group: scan_all_items_first # Step 5: group problems often scan items first to check for rule conflicts
def __init__(self):
self.groups = {}
self.items = {}
self.groups = {
"policy1": {"config": "strict"},
}
self.items = {
"violation1": {
"group_id": "policy1",
"other_id": "alice",
"data": 4,
"status": "ACTIVE",
},
}
The two dicts are used together:
- self.groups
- self.items
Item-only methods go straight to:
self.items[item_id]
create_group(timestamp, group_id, config)
create_item(timestamp, group_id, item_id, other_id, data)
clear_item(timestamp, item_id)
get_active_count(timestamp, other_id)
get_item(timestamp, item_id)
delete_group(timestamp, group_id)
# Common real problem names:
register_policy / flag_violation / clear_violation
create_queue / submit_report / resolve_report
register_band / lease / revoke
Telltale signs:
- some method takes only item_id
- but item_id is not a key of groups
What gets tested:
- how groups and items divide the work
- which two relationships an item must record
- how to write an item-only method
Step 1: decide that a group stores config and an item stores the real instance
Step 2: write create_group and create_item separately
Step 3: in create_item, record group_id + other_id + data + status
Step 4: item-only methods such as clear / get go straight to self.items[item_id]
Step 5: count problems usually loop over items, because items are not bucketed by other_id
def find_item(self, timestamp, item_id): # 由 item_id 反查佢掛喺邊個 group,同埋同邊個 other_id 有關
if item_id not in self.items: # 目標 item 根本唔喺主表
return None
item = self.items[item_id] # 攞返成張 item 卡
return (item["group_id"], item["other_id"]) # generic 版回最重要兩條關係線
def top_n(self, timestamp, n): # 排頭 n 個最「多 active item」嘅 other_id
counts = {} # other_id -> active item count
for item_id, item in self.items.items(): # 巡全部 instance
if item["status"] == "ACTIVE": # 死咗 / 清咗嗰啲唔好計
oid = item["other_id"]
if oid not in counts: # 第一次見到呢個 other_id
counts[oid] = 0
counts[oid] += 1 # 多一件 active item 就加一
items = []
for oid, count in counts.items(): # 將統計結果變成可排序 tuple
items.append((-count, oid)) # 多嘅排先;同數用 other_id 作 tie-break
items.sort()
result = []
for value, oid in items[:n]: # 拎榜首 n 個
result.append(oid)
return result
def get_group_items(self, timestamp, group_id): # group lookup variant:列出某個 group 底下所有 item
result = [] # 結果 list
for item_id, item in self.items.items(): # 巡 items[B]
if item["group_id"] == group_id: # 屬於呢個 group 就收
result.append(item_id)
result.sort() # 排序後交返
return result
def get_items_for_other(self, timestamp, other_id): # other lookup variant:列出某個 owner / entity 底下所有 item
result = [] # 結果 list
for item_id, item in self.items.items(): # 一樣係巡 items[B]
if item["other_id"] == other_id and item["status"] == "ACTIVE": # 只要 live 嗰堆
result.append(item_id)
result.sort()
return result
if by_item_id: read_self_items_directly # Step 1: item_id problems go straight to self.items
if by_group_or_other: for item_id, item in self.items.items(): ... # Step 2: group / other stats problems scan the B dict
if only_live_items: filter_status_active # Step 3: many problems count only ACTIVE
rows.append((-count, oid)) # Step 4: ranking problems collect tuples first
return formatted_after_sort # Step 5: format last
def __init__(self):
self.groups = {}
self.items = {}
self.groups = {
"policy1": {"config": "strict"},
}
self.items = {
"violation1": {"group_id": "policy1", "other_id": "alice", "data": 4, "status": "ACTIVE"},
"violation2": {"group_id": "policy1", "other_id": "bob", "data": 2, "status": "ACTIVE"},
}
find_item:
look up self.items[item_id] directly
top_n / stats:
loop over self.items and count by other_id / status / group_id
the ranking is usually not over self.groups
find_item(timestamp, item_id)
top_n(timestamp, n)
get_group_items(timestamp, group_id)
get_items_for_other(timestamp, other_id)
get_status(timestamp, group_id, item_id)
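# Common real problem names:
find_violation / get_worst_entities
get_operator_bands / get_group_history
get_step_status / get_progress / list_workflows
Core:
- look up the item dict directly by item_id
- or loop over items and count by other_id / status / group_id
What gets tested:
- whether the query is an item lookup, a group lookup, or other_id stats
- which dimension the top-N aggregates over
- whether status takes part in the filter
Step 1: check whether the parameter is item_id, group_id, or other_id
Step 2: item_id problems look up self.items directly
Step 3: group / other stats problems loop over self.items to aggregate
Step 4: usually count only the ACTIVE / live items
Step 5: after sorting, decide whether to return tuples, a list, or formatted strings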
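As a worked example of Steps 3 to 5, here is a small stand-alone sketch of the top-N aggregation with the "count descending, then id ascending" tie-break. The function name `top_n_active` and the sample records are illustrative assumptions; only the `other_id` / `status` fields come from the notes above.

```python
def top_n_active(items, n):
    """Rank other_ids by number of ACTIVE items; ties broken by other_id ascending."""
    counts = {}
    for item in items.values():  # full scan: items is flat, not bucketed by other_id
        if item["status"] == "ACTIVE":
            counts[item["other_id"]] = counts.get(item["other_id"], 0) + 1
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other_id for other_id, _ in ranked[:n]]

# Hypothetical sample data
items = {
    "v1": {"other_id": "alice", "status": "ACTIVE"},
    "v2": {"other_id": "bob", "status": "ACTIVE"},
    "v3": {"other_id": "bob", "status": "CLEARED"},  # not counted
    "v4": {"other_id": "carol", "status": "ACTIVE"},
    "v5": {"other_id": "carol", "status": "ACTIVE"},
}
ranking = top_n_active(items, 2)  # carol has 2; alice and bob tie at 1, alice sorts first
```

Sorting with a `key` of `(-count, id)` gives the same order as building `(-count, id)` tuples and calling `sort()`; which one to use is a matter of taste.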
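def _purge(self, timestamp):  # helper introduced at L3, because items only start expiring now
    for item in self.items.values():  # statuses are mutated in place, so no key-list copy is needed
        exp = item["expiry"]
        # only ACTIVE items expire here; without this guard a CLEARED item with a TTL would be relabelled EXPIRED
        if item["status"] == "ACTIVE" and exp is not None and timestamp >= exp:  # alive only while timestamp < expiry
            item["status"] = "EXPIRED"  # this family marks status instead of deleting, so history / report can still see it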
def create_item_with_ttl(self, timestamp, group_id, item_id, other_id, data, ttl_ms): # 開 item 時順手加壽命
self._purge(timestamp) # 做新動作前先將應該死嘅 item 先判死
if group_id not in self.groups or item_id in self.items: # 父 group 唔存在,或者 item 撞名,都唔畀開
return False
expiry = None # 預設永久 item
if ttl_ms is not None:
expiry = timestamp + ttl_ms # 將 ttl 轉成絕對時間炸彈
self.items[item_id] = {
"group_id": group_id,
"other_id": other_id,
"data": data,
"status": "ACTIVE", # 一開先當活緊;helper 到鐘先會改成 EXPIRED
"expiry": expiry,
}
self.history[group_id].append(item_id) # 呢個 group 嘅歷史清單都要記埋
return True
def set_status(self, timestamp, item_id, new_status): # state-machine variant:手動改狀態
self._purge(timestamp) # 先跑 helper
if item_id not in self.items: # item 唔存在
return False
self.items[item_id]["status"] = new_status # 直接改 status
return True
def complete_item(self, timestamp, item_id): # workflow-like variant:完成一件 item
self._purge(timestamp)
if item_id not in self.items or self.items[item_id]["status"] != "ACTIVE": # 唔存在或者唔係可完成狀態
return False
self.items[item_id]["status"] = "COMPLETED" # 完成後改新狀態
self.items[item_id]["expiry"] = None # 完成後通常唔再等 TTL
return True
self.items[item_id]["expiry"] = timestamp + ttl_ms # Step 1: add expiry to the item
self.items[item_id]["status"] = "ACTIVE" # Step 2: the initial status is usually ACTIVE / PENDING
if expired: self.items[item_id]["status"] = "EXPIRED" # Step 3: F3 usually marks status instead of deleting
if state_machine: transition_to_next_status() # Step 4: workflow problems also just change status
self._purge(timestamp) # Step 5: run the helper at the top of every public method
def __init__(self):
self.groups = {}
self.items = {}
self.groups = {
"policy1": {"config": "strict"},
}
self.items = {
"violation1": {
"group_id": "policy1",
"other_id": "alice",
"data": 4,
"status": "ACTIVE",
"expiry": 9000,
},
}
The biggest difference in F3:
after expiry the item is often not deleted
instead it is changed to:
status = "EXPIRED"
so an item needs both:
- status
- expiry
create_item_with_ttl(timestamp, group_id, item_id, other_id, data, ttl_ms)
expire(timestamp, item_id)
set_status(timestamp, item_id, new_status)
start_item / complete_item / fail_item
# The two common Family 3 L3 routes:
1. TTL + status:
ACTIVE -> EXPIRED
2. state machine:
PENDING -> READY -> PROCESSING -> COMPLETED / FAILED
# Real problem names:
flag_violation(..., ttl_ms)
lease(..., ttl_ms)
complete_step / fail_step / _process_triggers
What gets tested:
- whether expiry means delete or a status change
- whether this is a TTL problem or a state-machine problem
- when the helper should be introduced
Step 1: add the expiry / status fields
Step 2: write the helper and decide whether the deadline means EXPIRED or a READY / FAILED transition
Step 3: call the helper at the top of every public method
Step 4: in create_item_with_ttl, set ACTIVE and expiry together
Step 5: for the workflow style, add trigger / transition rules on top
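The TTL boundary is easy to get wrong by one tick, so here is a minimal sketch of the "strictly less than" rule: an item is alive only while timestamp < expiry, and at the expiry tick itself it already counts as expired. The helper names `is_alive` and `purge` are illustrative, and the mark-instead-of-delete behaviour follows the F3 convention described above.

```python
def is_alive(timestamp, expiry):
    """Alive while timestamp is strictly less than expiry; None means no TTL."""
    return expiry is None or timestamp < expiry

def purge(items, timestamp):
    """Mark due ACTIVE items as EXPIRED instead of deleting them."""
    for item in items.values():
        if item["status"] == "ACTIVE" and not is_alive(timestamp, item["expiry"]):
            item["status"] = "EXPIRED"

items = {
    "v1": {"status": "ACTIVE", "expiry": 9000},
    "v2": {"status": "ACTIVE", "expiry": None},  # permanent item
}
purge(items, 8999)
status_before = items["v1"]["status"]  # one tick before expiry: still ACTIVE
purge(items, 9000)
status_at_expiry = items["v1"]["status"]  # exactly at expiry: EXPIRED
```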
def backup(self, timestamp): # this family snapshots three pieces of state (groups, items, history), not just items
self._purge(timestamp) # 影相前先判晒邊啲 item 已過期
snap_groups = copy.deepcopy(self.groups) # groups 係第一層大設定
snap_items = {} # items 要自己逐件轉成 backup-friendly 形狀
for item_id, info in self.items.items():
remaining = None # 預設永久 item 冇 remaining ttl
if info["expiry"] is not None:
remaining = info["expiry"] - timestamp # 有期限就轉成剩餘壽命
snap_items[item_id] = {
"group_id": info["group_id"],
"other_id": info["other_id"],
"data": info["data"],
"status": info["status"], # status 都要影埋;restore 後先知當時係 ACTIVE 定 CLEARED
"remaining_ttl": remaining,
}
snap_history = copy.deepcopy(dict(self.history)) # history 亦都係系統 state,一樣要凍結落相
self.backups.append((timestamp, snap_groups, snap_items, snap_history))
def restore(self, timestamp, target_ts): # 將 group / item / history 一次過倒帶
best = None
for entry in self.backups:
if entry[0] <= target_ts: # 只揀 target_ts 或之前嗰啲舊相
best = entry
if best is None:
return False
backup_ts, snap_groups, snap_items, snap_history = best
self.groups = copy.deepcopy(snap_groups) # 先還原 groups
self.items = {} # items 要逐件重建,因為要重算 expiry
for item_id, info in snap_items.items():
new_expiry = None
if info["remaining_ttl"] is not None:
new_expiry = timestamp + info["remaining_ttl"] # 由「而家」重新起錶
self.items[item_id] = {
"group_id": info["group_id"],
"other_id": info["other_id"],
"data": info["data"],
"status": info["status"],
"expiry": new_expiry,
}
self.history = defaultdict(list) # history 先重開一個乾淨 defaultdict
for k, v in snap_history.items():
self.history[k] = list(v) # 再將每條歷史帶抄返入去
return True
def get_history(self, timestamp, group_id): # 查呢個 group 底下曾經出現過咩 item
return list(self.history.get(group_id, [])) # 即使冇都回空 list copy;caller 好處理啲
def merge_group(self, timestamp, from_id, to_id): # 將一堆 item 由舊 group 集體搬倉去新 group
self._purge(timestamp) # 合併前先判死;免得搬埋過期屍體
if from_id not in self.groups or to_id not in self.groups or from_id == to_id:
return False
for item_id, item in self.items.items(): # 巡所有 item
if item["group_id"] == from_id: # 凡係舊 group 旗下嘅,都要改戶籍
item["group_id"] = to_id
self.merged_groups[from_id] = to_id # 記低舊 group 去咗邊
del self.groups[from_id] # 舊 group 本體正式除名
return True
def transfer_item(self, timestamp, item_id, new_other_id): # reassign variant:唔改 group,只改另一條關係線
if item_id not in self.items: # item 唔存在
return False
self.items[item_id]["other_id"] = new_other_id # 直接改 owner / entity / operator
return True
def rollback_item(self, timestamp, item_id): # rollback variant:將 completed / failed 退返去 pending
if item_id not in self.items: # item 唔存在
return False
if self.items[item_id]["status"] not in {"COMPLETED", "FAILED"}: # 唔係可 rollback 狀態
return False
self.items[item_id]["status"] = "PENDING" # 退返去待處理
return True
snapshot = (groups, items, history) # Step 1: list which pieces of state L4 must freeze
remaining_ttl = expiry - timestamp # Step 2: an item with a TTL is stored as remaining_ttl
self.merged_groups[from_id] = to_id # Step 3: merge always records the mapping
self.items[item_id]["group_id"] = to_id # Step 4: merge changes group_id
self.items[item_id]["other_id"] = new_other_id # Step 5: transfer/reassign changes other_id
def __init__(self):
self.groups = {}
self.items = {}
self.backups = []
self.merged_groups = {}
self.history = defaultdict(list)
self.groups = {
"g1": {"config": "gold"},
}
self.items = {
"i1": {"group_id": "g1", "other_id": "user1", "data": 40, "status": "ACTIVE", "expiry": 9000},
"i2": {"group_id": "g1", "other_id": "user2", "data": 20, "status": "CLEARED", "expiry": None},
}
self.history = {
"g1": ["i1", "i2"],
}
self.backups = [
(5000,
{"g1": {"config": "gold"}},
{
"i1": {"group_id": "g1", "other_id": "user1", "data": 40, "status": "ACTIVE", "remaining_ttl": 4000},
"i2": {"group_id": "g1", "other_id": "user2", "data": 20, "status": "CLEARED", "remaining_ttl": None},
},
{"g1": ["i1", "i2"]}),
]
self.merged_groups = {
"old_group": "g1",
}
backup uses:
self.backups = [(ts, snap_groups, snap_items, snap_history)]
history uses:
self.history[group_id]
merge uses:
self.merged_groups[from_id] = to_id
what merge_group actually changes:
self.items[item_id]["group_id"]
backup(timestamp)
restore(timestamp, target_ts)
get_history(timestamp, group_id)
merge_group(timestamp, from_id, to_id)
transfer_item(timestamp, item_id, new_other_id)
fail_step(timestamp, group_id, step_id) # rollback 款
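You need to tell apart:
- history may hang off group_id
- merge changes item["group_id"]
- transfer/reassign changes item["other_id"]
- workflow problems may also roll back COMPLETED -> PENDING
What gets tested:
- whether groups / items / history must be snapshotted together
- whether merge changes group_id or other_id
- whether a rollback problem is really a special history / state problem
Step 1: list which pieces of state must be backed up
Step 2: an item with a TTL is stored as remaining_ttl
Step 3: on restore, rebuild groups, items, and history one by one
Step 4: merge_group must scan all items and change group_id
Step 5: for a workflow rollback, add the status rollback rules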
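Step 2 and Step 3 boil down to a small piece of arithmetic: the backup stores expiry minus the backup time, and the restore adds that remainder to the restore time. The sketch below uses two hypothetical helpers, `snapshot_item` and `restore_item`, and the same numbers as the sample state (expiry 9000, backup taken at 5000); the restore time 12000 is an arbitrary choice for illustration.

```python
def snapshot_item(item, backup_ts):
    """Convert an item to backup form: absolute expiry becomes remaining_ttl."""
    snap = {k: v for k, v in item.items() if k != "expiry"}
    snap["remaining_ttl"] = None if item["expiry"] is None else item["expiry"] - backup_ts
    return snap

def restore_item(snap, restore_ts):
    """Rebuild a live item: the clock restarts from the restore time."""
    item = {k: v for k, v in snap.items() if k != "remaining_ttl"}
    remaining = snap["remaining_ttl"]
    item["expiry"] = None if remaining is None else restore_ts + remaining
    return item

live = {"status": "ACTIVE", "expiry": 9000}
snap = snapshot_item(live, 5000)      # remaining_ttl = 9000 - 5000 = 4000
restored = restore_item(snap, 12000)  # expiry = 12000 + 4000 = 16000
```

Storing the original expiry (9000) instead would make the item already expired at restore time 12000, which is exactly the mistake the remaining_ttl pattern avoids.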
async def batch(self, timestamp, operations):  # process many create / clear / transfer ops in one call
    self._purge(timestamp)  # mark due items EXPIRED before the batch starts
    async def execute_op(op):  # each op runs in its own coroutine
        if op["type"] == "transfer":  # changes ownership; touches two other_ids at once
            if op["from_other"] == op["to_other"]:  # asyncio.Lock is not reentrant; locking one key twice would hang forever
                return False  # assumption: a same-owner transfer is rejected
            first, second = sorted([op["from_other"], op["to_other"]])  # always lock in sorted order so A-waits-B / B-waits-A cannot happen
            async with self.locks[first]:
                async with self.locks[second]:
                    item = self.items.get(op["item_id"])  # the item whose relationship is being moved
                    if not item:
                        return False
                    item["other_id"] = op["to_other"]  # re-attach the item from the old owner to the new one
                    return True
        item_id = op.get("item_id", "")  # every other op touches a single item
        async with self.locks[item_id]:  # one lock per item, so clear / update / create cannot collide
            if op["type"] == "create":
                return self.create_item(timestamp, op["group_id"], op["item_id"], op["other_id"], op["data"])
            if op["type"] == "clear":
                return self.clear_item(timestamp, op["item_id"])
            return None
    tasks = [execute_op(op) for op in operations]  # turn every op into a coroutine
    results = await asyncio.gather(*tasks)  # run together; results come back in input order
    return list(results)
async def batch_resolve(self, timestamp, item_ids): # resolve-only batch variant:全部都係單 item op
async def do_one(item_id):
async with self.locks[item_id]: # item_id scope
return self.clear_item(timestamp, item_id)
return list(await asyncio.gather(*(do_one(item_id) for item_id in item_ids)))
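async def batch_transfer(self, timestamp, transfers):  # transfer-only variant: every op touches two other_ids
    async def do_one(transfer):
        if transfer["from_other"] == transfer["to_other"]:  # asyncio.Lock is not reentrant; the same key twice would hang
            return False  # assumption: a same-owner transfer is rejected
        first, second = sorted([transfer["from_other"], transfer["to_other"]])  # one consistent pair-lock order
        async with self.locks[first]:
            async with self.locks[second]:
                return self.transfer_item(timestamp, transfer["item_id"], transfer["to_other"])
    return list(await asyncio.gather(*(do_one(t) for t in transfers)))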
scope = item_or_other_relation(op) # Step 1: decide the shared-state scope first
if single_item_scope: async_with_lock(item_id) # Step 2: a single-item op takes one lock
if pair_other_scope: keys = sorted([from_other, to_other]) # Step 3: a two-owner relationship takes a sorted pair-lock
reuse_old_sync_method_inside_lock() # Step 4: inside the lock, call the existing logic where possible
return list(await asyncio.gather(*tasks)) # Step 5: finally gather and return the results
def __init__(self):
self.groups = {}
self.items = {}
self.backups = []
self.merged_groups = {}
self.history = defaultdict(list)
self.locks = defaultdict(asyncio.Lock)
self.items = {
"i1": {"group_id": "g1", "other_id": "user1", "data": 40, "status": "ACTIVE", "expiry": None},
}
self.locks["i1"] = <Lock>
self.locks["user1"] = <Lock>
self.locks["user2"] = <Lock>
The data itself:
self.items
The lock scope may be:
- item_id
- other_id
- group_id
Look at which shared state the op really changes
then decide which key to lock
batch(timestamp, operations)
batch_audit(timestamp, ops)
batch_moderate(timestamp, ops)
batch_ops(timestamp, ops)
# single-key ops
create / clear / resolve / revoke / update_status
# two-key ops
transfer(item_id, from_other, to_other)
escalate(from_queue, to_queue, report_id)
move lease/operator ownership
The lock scope may be:
item_id / other_id / group_id
depending on which shared state the op really changes
What gets tested:
- whether the op changes an item, a group, or the owner relationship
- whether the pair-lock should cover two other_ids or two groups
- whether batch can simply reuse the existing methods
Step 1: write execute_op(op) first
Step 2: decide the shared-state scope op by op
Step 3: one scope takes one lock; two scopes take a sorted pair-lock
Step 4: do the create / clear / transfer / escalate inside the lock
Step 5: finally gather all results, preserving input order
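To see why the pair-lock is sorted, the sketch below runs two transfers in opposite directions at the same time. The function `run_transfers`, the `owners` dict, and the tuple layout are assumptions made for the demo. The `await asyncio.sleep(0)` between the two acquisitions deliberately forces the interleaving that would deadlock if each task locked "from" first and "to" second. Rejecting a transfer whose two keys are equal is also an assumed convention: `asyncio.Lock` is not reentrant, so acquiring the same lock twice in one task would block forever.

```python
import asyncio
from collections import defaultdict

async def run_transfers(owners, transfers):
    """owners: item_id -> owner; transfers: list of (item_id, src, dst)."""
    locks = defaultdict(asyncio.Lock)

    async def do_one(item_id, src, dst):
        if src == dst:
            return False  # same key twice would hang on a non-reentrant lock
        first, second = sorted([src, dst])  # every task agrees on the lock order
        async with locks[first]:
            await asyncio.sleep(0)  # yield so the opposite-direction task gets to run
            async with locks[second]:
                if owners.get(item_id) != src:
                    return False  # item missing or not owned by src
                owners[item_id] = dst
                return True

    return list(await asyncio.gather(*(do_one(*t) for t in transfers)))

owners = {"i1": "alice", "i2": "bob"}
results = asyncio.run(run_transfers(owners, [
    ("i1", "alice", "bob"),    # locks alice then bob
    ("i2", "bob", "alice"),    # also locks alice then bob, thanks to sorted()
    ("i1", "carol", "carol"),  # rejected before any lock is taken
]))
```

`asyncio.gather` returns results in the order the coroutines were passed in, which is what keeps the output aligned with the input ops.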
async def report(self, timestamp, item_ids, max_concurrent): # 將一批 still-active item 推出去 report / send / sync
self._purge(timestamp) # 出發前先將過期嗰堆轉做 EXPIRED
sem = asyncio.Semaphore(max_concurrent) # 外部渠道只容納同時 N 個 request
async def do_one(item_id): # 每件 item 自己一條外發線
if item_id not in self.items: # 連 item 都唔存在
return False # fail-fast:未出門已經知道失敗
if self.items[item_id]["status"] != "ACTIVE": # 已 clear / expired / reported 嘅都唔應再送
return False
async with sem: # 真正輪到佢先佔用外部管道
await asyncio.sleep(0.01) # 模擬 external reporting call
self.items[item_id]["status"] = "REPORTED" # 成功送出後,喺本地即刻改 status 留底
return True
tasks = [do_one(item_id) for item_id in item_ids] # 每件 item 一齊排隊
results = await asyncio.gather(*tasks) # 等晒全部外發結果
return list(results)
async def send_decisions(self, timestamp, item_ids, max_concurrent): # decision variant:只送 RESOLVED 嗰堆
sem = asyncio.Semaphore(max_concurrent) # 外部閘口
async def do_one(item_id):
item = self.items.get(item_id) # flat item lookup
if item is None or item["status"] != "RESOLVED": # fail-fast:未 resolve 唔送得
return False
async with sem:
await asyncio.sleep(0.01) # 模擬 send decision
item["status"] = "SENT" # 成功後 mark SENT
return True
return list(await asyncio.gather(*(do_one(item_id) for item_id in item_ids)))
async def sync_group_items(self, timestamp, group_id, max_concurrent):  # group-scoped variant: pick the items of one group and sync the live ones
    sem = asyncio.Semaphore(max_concurrent)
    async def do_one(item_id):
        item = self.items[item_id]
        if item["status"] != "ACTIVE":  # fail-fast: non-active items skip the semaphore
            return False
        async with sem:
            await asyncio.sleep(0.01)
            item["status"] = "SYNCED"
            return True
    ids = sorted(i for i, item in self.items.items() if item["group_id"] == group_id)  # only this group's items, in deterministic order
    return list(await asyncio.gather(*(do_one(item_id) for item_id in ids)))
sem = asyncio.Semaphore(max_concurrent) # Step 1: create sem inside the function
item = self.items.get(item_id) # Step 2: check that the item exists first
if item is None or item["status"] not in allowed_statuses: return False # Step 3: status fail-fast
async with sem: await asyncio.sleep(0.01) # Step 4: do the external work only after passing
item["status"] = next_status # Step 5: on success mark REPORTED / SENT / SYNCED
def __init__(self):
self.groups = {}
self.items = {}
self.backups = []
self.merged_groups = {}
self.history = defaultdict(list)
self.locks = defaultdict(asyncio.Lock)
self.items = {
"i1": {"group_id": "g1", "other_id": "user1", "data": 40, "status": "ACTIVE", "expiry": None},
"i2": {"group_id": "g1", "other_id": "user2", "data": 20, "status": "CLEARED", "expiry": None},
}
# sem still does not go in __init__
# it is created locally inside report()/send_decisions()/sync_bands()
Persistent state:
- self.items
- self.locks (only when needed)
Temporary runtime state:
- sem
Fail-fast checks usually look at:
- whether the item exists
- whether status is ACTIVE / READY / RESOLVED
Common side effects on success:
- status = REPORTED / SENT / SYNCED
report(timestamp, item_ids, max_concurrent)
send_decisions(timestamp, report_ids, max_concurrent)
sync_bands(timestamp, band_ids, max_concurrent)
execute_steps(timestamp, step_ids, max_concurrent)
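Common Family 3 L6 checks:
- whether the item / report / lease exists
- whether status is ACTIVE / RESOLVED / READY
- enter sem only after passing
Common side effects after passing:
- mark REPORTED
- mark SENT
- mark SYNCED
- write back a history / status transition
What gets tested:
- existence + status fail-fast
- how the work splits between sem (external rate limiting) and the local status update
- which new status to mark on success
Step 1: create sem inside the function
Step 2: do_one(item_id) first checks that the item exists and its status qualifies
Step 3: enter sem for the external sleep / call only after passing
Step 4: on success, immediately mark REPORTED / SENT / SYNCED
Step 5: gather all results; if the spec asks, also record the history transition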
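The five steps can be exercised end to end with a stand-alone sketch. The function `report` here is a simplified free function, not the class method, and the `in_flight` / `peak` counters are instrumentation added only to show that the semaphore really caps concurrency at max_concurrent; they would not appear in a real answer.

```python
import asyncio

async def report(items, item_ids, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)  # Step 1: function-local semaphore
    in_flight = 0  # instrumentation only
    peak = 0       # instrumentation only

    async def do_one(item_id):
        nonlocal in_flight, peak
        item = items.get(item_id)
        if item is None or item["status"] != "ACTIVE":
            return False  # Step 2: existence + status fail-fast, before the semaphore
        async with sem:  # Step 3: only qualifying items occupy a slot
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.001)  # simulated external call
            in_flight -= 1
            item["status"] = "REPORTED"  # Step 4: mark the new status on success
        return True

    results = await asyncio.gather(*(do_one(i) for i in item_ids))  # Step 5
    return list(results), peak

items = {f"i{n}": {"status": "ACTIVE"} for n in range(5)}
items["i4"]["status"] = "CLEARED"
results, peak = asyncio.run(
    report(items, ["i0", "i1", "i2", "i3", "i4", "missing"], 2)
)
```

Four items qualify, but no more than two are ever inside the semaphore at once; the cleared item and the missing id return False without sleeping.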
InitFamily problem B. Family 2: one channel contains many signals. The main structure is self.channels[channel_id]["signals"][signal_id].
import copy
import asyncio
from collections import defaultdict
class SignalProcessingPipeline:
def __init__(self):
self.channels = {}
self.backups = []
self.locks = defaultdict(asyncio.Lock)
def __init__(self):
self.channels = {}
self.backups = []
self.locks = defaultdict(asyncio.Lock)
self.channels = {
"c1": {
"max_signals": 5,
"signals": {
"s1": {"strength": 80, "created_at": 100, "expiry": 5000},
"s2": {"strength": 40, "created_at": 200, "expiry": None},
},
"history": ["s1", "s2", "s3"],
},
}
outer key: channel_id
inner key: signal_id
L3 adds expiry
L4 adds backups / history
L5 adds locks
L6 adds no new field; sem is created inside the method
def create_channel(self, timestamp, channel_id, max_signals):
if channel_id in self.channels:
return False
self.channels[channel_id] = {
"max_signals": max_signals,
"signals": {},
"history": [],
}
return True
def inject_signal(self, timestamp, channel_id, signal_id, strength):
if channel_id not in self.channels:
return False
channel = self.channels[channel_id] # 拎住呢條 channel
if signal_id in channel["signals"]:
return False # 同名 signal 唔再插第二次
if len(channel["signals"]) >= channel["max_signals"]:
return False # 條 channel 已經滿 signal
channel["signals"][signal_id] = {
"strength": strength,
"created_at": timestamp,
"expiry": None,
}
channel["history"].append(signal_id) # 記低曾經入過嚟
return True
def drop_signal(self, timestamp, channel_id, signal_id):
if channel_id not in self.channels:
return False
channel = self.channels[channel_id]
if signal_id not in channel["signals"]:
return False
del channel["signals"][signal_id] # 由 channel 入面拎走呢粒 signal
return True
def read_channel(self, timestamp, channel_id):
if channel_id not in self.channels:
return None
return {sid: dict(sig) for sid, sig in self.channels[channel_id]["signals"].items()}
def __init__(self):
self.channels = {}
self.channels = {
"c1": {
"max_signals": 5,
"signals": {
"s1": {"strength": 80, "created_at": 100},
},
},
}
Step 1: create_channel builds the outer container first
Step 2: inject_signal then puts a sub-item into signals
Step 3: read_channel only reads the inner signals dict
When signal_id is not the outer key
you know the problem is Family 2
def strongest_channels(self, timestamp, n):
scored = []
for channel_id, channel in self.channels.items():
total = sum(sig["strength"] for sig in channel["signals"].values()) # 將成條 channel 嘅 signal 力量加埋
scored.append((total, channel_id))
scored.sort(key=lambda x: (-x[0], x[1]))
return [channel_id for total, channel_id in scored[:n]]
def find_signal(self, timestamp, signal_id):
for channel_id, channel in self.channels.items(): # 因為 signal_id 唔係外層 key,要行晒全部 channel 搵
if signal_id in channel["signals"]:
return channel_id
return None
self.channels = {
"c1": {
"signals": {
"s1": {"strength": 80},
"s2": {"strength": 40},
},
},
"c2": {
"signals": {
"s9": {"strength": 30},
},
},
}
strongest_channels uses:
sum(signal["strength"] for signal in channel["signals"].values())
find_signal uses:
a loop over every channel to find signal_id
1. strongest_channels = aggregate first, then sort
2. find_signal = loop over all channels
3. the data structure is unchanged: still channels / signals
Step 1: ask yourself whether signal_id is the outer key
Step 2: if it is not, loop over all channels
Step 3: for a ranking, compute total strength first, then sort
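The two query shapes can be tried on a small nested dict. The functions below are free-function versions of find_signal and strongest_channels written for this demo; channel "c0" is an extra hypothetical entry added so that the id-ascending tie-break is visible (c0 and c1 both total 120).

```python
def find_signal(channels, signal_id):
    """signal_id is an inner key, so every channel has to be scanned."""
    for channel_id, channel in channels.items():
        if signal_id in channel["signals"]:
            return channel_id
    return None

def strongest_channels(channels, n):
    """Aggregate total strength per channel, then sort desc by total, asc by id."""
    scored = [
        (sum(sig["strength"] for sig in channel["signals"].values()), channel_id)
        for channel_id, channel in channels.items()
    ]
    scored.sort(key=lambda row: (-row[0], row[1]))
    return [channel_id for _, channel_id in scored[:n]]

channels = {
    "c1": {"signals": {"s1": {"strength": 80}, "s2": {"strength": 40}}},
    "c2": {"signals": {"s9": {"strength": 30}}},
    "c0": {"signals": {"s5": {"strength": 120}}},  # hypothetical: ties with c1
}
owner = find_signal(channels, "s9")
ranking = strongest_channels(channels, 2)
```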
def _purge_expired_signals(self, timestamp):
for channel in self.channels.values():
dead = []
for signal_id, signal in channel["signals"].items():
if signal["expiry"] is not None and timestamp >= signal["expiry"]:
dead.append(signal_id) # 到期 signal 先收集,唔好一路 loop 一路 del
for signal_id in dead:
del channel["signals"][signal_id]
def inject_signal_with_ttl(self, timestamp, channel_id, signal_id, strength, ttl_ms):
self._purge_expired_signals(timestamp) # 每個 public method 開頭先掃一次 expiry
if not self.inject_signal(timestamp, channel_id, signal_id, strength):
return False
self.channels[channel_id]["signals"][signal_id]["expiry"] = timestamp + ttl_ms
return True
def get_signal_age(self, timestamp, channel_id, signal_id):
self._purge_expired_signals(timestamp)
if channel_id not in self.channels:
return None
signal = self.channels[channel_id]["signals"].get(signal_id)
if signal is None:
return None
return timestamp - signal["created_at"]
self.channels = {
"c1": {
"signals": {
"s1": {"strength": 80, "created_at": 100, "expiry": 5100},
"s2": {"strength": 40, "created_at": 200, "expiry": None},
},
},
}
Step 1: add expiry to the sub-item
Step 2: purge expired sub-items at the top of every public method
Step 3: after a successful inject, fill in timestamp + ttl_ms
The TTL lives inside the signal
not outside on the channel
def backup(self, timestamp):
self._purge_expired_signals(timestamp)
snapshot = {}
for channel_id, channel in self.channels.items():
snapshot[channel_id] = {
"max_signals": channel["max_signals"],
"signals": {},
"history": list(channel["history"]),
}
for signal_id, signal in channel["signals"].items():
remaining_ttl = None
if signal["expiry"] is not None:
remaining_ttl = signal["expiry"] - timestamp
snapshot[channel_id]["signals"][signal_id] = {
"strength": signal["strength"],
"created_at": signal["created_at"],
"remaining_ttl": remaining_ttl,
}
self.backups.append((timestamp, snapshot))
return str(len(snapshot))
def restore(self, timestamp, restore_timestamp):
self._purge_expired_signals(timestamp)
candidate = None
for ts, snapshot in self.backups:
if ts <= restore_timestamp:
candidate = snapshot
if candidate is None:
return False
self.channels = copy.deepcopy(candidate)
for channel in self.channels.values():
for signal in channel["signals"].values():
remaining_ttl = signal.pop("remaining_ttl")
signal["expiry"] = None if remaining_ttl is None else timestamp + remaining_ttl
return True
def get_signal_history(self, timestamp, channel_id):
self._purge_expired_signals(timestamp)
if channel_id not in self.channels:
return None
return list(self.channels[channel_id]["history"])
def __init__(self):
self.channels = {}
self.backups = []
self.channels = {
"c1": {
"max_signals": 5,
"signals": {
"s1": {"strength": 80, "created_at": 100, "expiry": 6200},
"s2": {"strength": 40, "created_at": 200, "expiry": None},
},
"history": ["s1", "s2", "s3"],
},
}
self.backups = [
(5000, {
"c1": {
"max_signals": 5,
"signals": {
"s1": {"strength": 80, "created_at": 100, "remaining_ttl": 1200},
"s2": {"strength": 40, "created_at": 200, "remaining_ttl": None},
},
"history": ["s1", "s2", "s3"],
},
}),
]
Step 1: backup stores remaining_ttl, not the original expiry
Step 2: restore recomputes expiry from the restore time
Step 3: history belongs to the channel, so it lives inside the channel
async def batch_operations(self, timestamp, ops):
    self._purge_expired_signals(timestamp)
    async def execute_op(op):
        if op["type"] in {"inject", "drop"}:
            channel_id = op["channel_id"]
            async with self.locks[channel_id]:  # a single-channel op takes one lock
                if op["type"] == "inject":
                    return self.inject_signal(timestamp, channel_id, op["signal_id"], op["strength"])
                return self.drop_signal(timestamp, channel_id, op["signal_id"])
        if op["from_channel_id"] == op["to_channel_id"]:  # asyncio.Lock is not reentrant; locking the same channel twice would hang forever
            return False  # assumption: a same-channel transfer is rejected
        first, second = sorted([op["from_channel_id"], op["to_channel_id"]])  # transfer_signal touches two channels, so lock both in sorted order
        async with self.locks[first]:
            async with self.locks[second]:
                from_channel = self.channels.get(op["from_channel_id"])
                to_channel = self.channels.get(op["to_channel_id"])
                if from_channel is None or to_channel is None:
                    return False
                signal = from_channel["signals"].pop(op["signal_id"], None)
                if signal is None:
                    return False
                to_channel["signals"][op["signal_id"]] = signal
                to_channel["history"].append(op["signal_id"])
                return True
    return list(await asyncio.gather(*(execute_op(op) for op in ops)))
def __init__(self):
self.channels = {}
self.backups = []
self.locks = defaultdict(asyncio.Lock)
data = self.channels
concurrency control = self.locks
Pattern A: inject / drop touch one channel → single lock
Pattern B: transfer_signal touches two channels at once → sorted pair-lock
Finally gather all ops
async def sync_channels(self, timestamp, channel_ids, max_concurrent):
self._purge_expired_signals(timestamp)
sem = asyncio.Semaphore(max_concurrent)
async def do_one(channel_id):
if channel_id not in self.channels:
return False # channel 唔存在即刻 fail,唔好入 sem
async with sem:
await asyncio.sleep(0.01) # 模擬真 external sync
self.channels[channel_id]["history"].append(f"sync@{timestamp}")
return True
return list(await asyncio.gather(*(do_one(channel_id) for channel_id in channel_ids)))
Persistent state is really the same as L5:
self.channels = {
"c1": {"history": ["s1", "s2", "sync@9000"]},
"c2": {"history": ["s9"]},
}
self.locks = defaultdict(asyncio.Lock)
Runtime state created separately:
sem = asyncio.Semaphore(max_concurrent)
In other words:
state lives in self.channels
rate limiting lives in the function-local sem
Step 1: create sem inside the function
Step 2: a missing channel fails fast first
Step 3: only after passing, enter sem and sleep
Step 4: on success, record the history / synced side effect
L6 adds nothing to __init__
only the method flow gets longer
InitFamily problem C. Family 2: one pool contains many job claims. The main structure is self.pools[pool_id]["jobs"][job_id].
import copy
import asyncio
from collections import defaultdict
class ResourceAllocationController:
def __init__(self):
self.pools = {}
self.backups = []
self.locks = defaultdict(asyncio.Lock)
def __init__(self):
self.pools = {}
self.backups = []
self.locks = defaultdict(asyncio.Lock)
self.pools = {
"pool1": {
"max_units": 10,
"jobs": {
"job1": {"units": 3, "claimed_at": 100, "expiry": 8000},
"job2": {"units": 5, "claimed_at": 200, "expiry": None},
},
"history": ["job1", "job2"],
},
}
pool_id is the outer key
job_id is the inner key
utilization needs no separate dict
it is computed on the spot by summing units over jobs
def create_pool(self, timestamp, pool_id, max_units):
if pool_id in self.pools:
return False
self.pools[pool_id] = {"max_units": max_units, "jobs": {}, "history": []}
return True
def claim_unit(self, timestamp, pool_id, job_id, unit_count):
if pool_id not in self.pools:
return False
pool = self.pools[pool_id]
if job_id in pool["jobs"]:
return False
used = sum(job["units"] for job in pool["jobs"].values()) # 呢個 pool 已經俾咗幾多單位出去
if used + unit_count > pool["max_units"]:
return False
pool["jobs"][job_id] = {"units": unit_count, "claimed_at": timestamp, "expiry": None}
pool["history"].append(job_id)
return True
def release(self, timestamp, pool_id, job_id):
if pool_id not in self.pools:
return False
return self.pools[pool_id]["jobs"].pop(job_id, None) is not None
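def get_utilization(self, timestamp, pool_id):
    if pool_id not in self.pools:
        return None
    pool = self.pools[pool_id]
    if pool["max_units"] == 0:  # assumed convention: a zero-capacity pool reports 0.0 instead of raising ZeroDivisionError
        return 0.0
    used = sum(job["units"] for job in pool["jobs"].values())  # computed on the spot, never cached
    return used / pool["max_units"]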
def __init__(self):
self.pools = {}
self.pools = {
"pool1": {
"max_units": 10,
"jobs": {
"job1": {"units": 3, "claimed_at": 100},
},
},
}
Step 1: create_pool builds the container
Step 2: claim_unit puts a job into jobs
Step 3: release takes it back out of jobs
Step 4: utilization is computed on the spot; do not store it separately
def get_job_claims(self, timestamp, job_id):
result = []
for pool_id, pool in self.pools.items(): # job_id 唔係外層 key,要行晒全部 pool
if job_id in pool["jobs"]:
result.append(pool_id)
return sorted(result)
def top_pools_by_utilization(self, timestamp, n):
rows = []
for pool_id in self.pools:
rows.append((self.get_utilization(timestamp, pool_id), pool_id))
rows.sort(key=lambda x: (-x[0], x[1]))
return [pool_id for _, pool_id in rows[:n]]
self.pools = {
    "pool1": {
        "max_units": 10,
        "jobs": {
            "job1": {"units": 3},
            "job2": {"units": 5},
        },
    },
    "pool2": {
        "max_units": 8,
        "jobs": {
            "job1": {"units": 2},
        },
    },
}

get_job_claims(job1) → has to walk both pool1 and pool2
top_pools_by_utilization → computes the ratio on the fly from the units inside jobs
job_id is not an outer key of pools
so get_job_claims always loops over all pools
compute utilization before ranking
the sort key is usually utilization descending, then id ascending
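
The ranking rule above (utilization descending, id ascending on ties) can be sketched on its own. This is a minimal illustration, not the handbook's class: the `pools` sample data and the helper name `top_pools` are assumptions made up for the example.

```python
# Hypothetical sample data: pool_id -> (used_units, max_units).
pools = {"pool_b": (5, 10), "pool_a": (4, 8), "pool_c": (9, 10)}


def top_pools(pools, n):
    # Compute the metric first, then sort on (-metric, id):
    # negating the metric gives descending order, the id breaks ties ascending.
    rows = [(used / max_units, pool_id) for pool_id, (used, max_units) in pools.items()]
    rows.sort(key=lambda row: (-row[0], row[1]))
    return [pool_id for _, pool_id in rows[:n]]


ranking = top_pools(pools, 3)
print(ranking)  # ['pool_c', 'pool_a', 'pool_b']
```

pool_a and pool_b both sit at 0.5, so the id decides their order.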
    def _purge_expired_claims(self, timestamp):
        for pool in self.pools.values():
            dead = []
            for job_id, claim in pool["jobs"].items():
                if claim["expiry"] is not None and timestamp >= claim["expiry"]:
                    dead.append(job_id)
            for job_id in dead:
                del pool["jobs"][job_id]  # an expired claim is released automatically
    def claim_unit_with_ttl(self, timestamp, pool_id, job_id, unit_count, ttl_ms):
        self._purge_expired_claims(timestamp)
        if not self.claim_unit(timestamp, pool_id, job_id, unit_count):
            return False
        self.pools[pool_id]["jobs"][job_id]["expiry"] = timestamp + ttl_ms
        return True

"jobs": {
    "job1": {"units": 3, "claimed_at": 100, "expiry": 8100},
    "job2": {"units": 2, "claimed_at": 120, "expiry": None},
}

Step 1: TTL still lives inside the sub-item
Step 2: purge expired claims at the start of each public method
Step 3: set expiry only after the claim succeeds
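
The purge boundary is easy to get off by one, so here is a small standalone sketch of it. It assumes the same rule as the purge helper above (alive only while `timestamp < expiry`); the function names `is_alive` and `purge` are made up for this example.

```python
def is_alive(timestamp, expiry):
    # No expiry means the claim never dies; otherwise it is alive
    # strictly before expiry, so the expiry tick itself counts as dead.
    return expiry is None or timestamp < expiry


def purge(jobs, timestamp):
    # Collect first, delete afterwards: deleting while iterating a dict raises RuntimeError.
    dead = [job_id for job_id, claim in jobs.items() if not is_alive(timestamp, claim["expiry"])]
    for job_id in dead:
        del jobs[job_id]
    return dead


jobs = {
    "job1": {"units": 3, "expiry": 8100},
    "job2": {"units": 2, "expiry": None},
}
removed_before = purge(jobs, 8099)  # one tick early: nothing is removed
removed_at = purge(jobs, 8100)      # exactly at expiry: job1 is removed
```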
    def backup(self, timestamp):
        self._purge_expired_claims(timestamp)
        snapshot = {}
        for pool_id, pool in self.pools.items():
            snapshot[pool_id] = {"max_units": pool["max_units"], "jobs": {}, "history": list(pool["history"])}
            for job_id, claim in pool["jobs"].items():
                remaining_ttl = None if claim["expiry"] is None else claim["expiry"] - timestamp
                snapshot[pool_id]["jobs"][job_id] = {
                    "units": claim["units"],
                    "claimed_at": claim["claimed_at"],
                    "remaining_ttl": remaining_ttl,
                }
        self.backups.append((timestamp, snapshot))
        return str(len(snapshot))

    def restore(self, timestamp, restore_timestamp):
        self._purge_expired_claims(timestamp)
        candidate = None
        for ts, snapshot in self.backups:
            if ts <= restore_timestamp:
                candidate = snapshot
        if candidate is None:
            return False
        self.pools = copy.deepcopy(candidate)
        for pool in self.pools.values():
            for claim in pool["jobs"].values():
                remaining_ttl = claim.pop("remaining_ttl")
                claim["expiry"] = None if remaining_ttl is None else timestamp + remaining_ttl
        return True

    def get_claim_history(self, timestamp, pool_id):
        self._purge_expired_claims(timestamp)
        if pool_id not in self.pools:
            return None
        return list(self.pools[pool_id]["history"])
    def __init__(self):
        self.pools = {}
        self.backups = []

self.pools = {
    "pool1": {
        "max_units": 10,
        "jobs": {
            "job1": {"units": 3, "claimed_at": 100, "expiry": 6200},
            "job2": {"units": 5, "claimed_at": 200, "expiry": None},
        },
        "history": ["job1", "job2", "job9"],
    },
}
self.backups = [
    (5000, {
        "pool1": {
            "max_units": 10,
            "jobs": {
                "job1": {"units": 3, "claimed_at": 100, "remaining_ttl": 1200},
                "job2": {"units": 5, "claimed_at": 200, "remaining_ttl": None},
            },
            "history": ["job1", "job2", "job9"],
        },
    }),
]

Step 1: backup converts expiry into remaining_ttl
Step 2: history belongs to the pool, so it lives inside the pool
Step 3: restore recomputes expiry from remaining_ttl
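
The arithmetic behind the three steps is short enough to check by hand. The sketch below reuses the numbers from the snapshot (expiry 6200, backup at 5000); the restore time of 9000 and the helper names `to_snapshot` / `from_snapshot` are assumptions for illustration only.

```python
def to_snapshot(claim, backup_ts):
    # Store how much life is left, not the absolute expiry.
    remaining = None if claim["expiry"] is None else claim["expiry"] - backup_ts
    return {"units": claim["units"], "remaining_ttl": remaining}


def from_snapshot(row, restore_ts):
    # Rebuild the absolute expiry relative to the moment of restore.
    remaining = row["remaining_ttl"]
    expiry = None if remaining is None else restore_ts + remaining
    return {"units": row["units"], "expiry": expiry}


live = {"units": 3, "expiry": 6200}
saved = to_snapshot(live, 5000)        # remaining_ttl = 6200 - 5000 = 1200
restored = from_snapshot(saved, 9000)  # expiry = 9000 + 1200 = 10200

forever = from_snapshot(to_snapshot({"units": 5, "expiry": None}, 5000), 9000)
```

If the snapshot had kept the absolute expiry 6200, the claim would already be dead when restored at 9000; keeping the remaining lifespan is what lets it survive the round trip.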
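    async def batch_claims(self, timestamp, ops):
        self._purge_expired_claims(timestamp)

        async def execute_op(op):
            if op["type"] in {"claim", "release"}:
                pool_id = op["pool_id"]
                async with self.locks[pool_id]:
                    if op["type"] == "claim":
                        return self.claim_unit(timestamp, pool_id, op["job_id"], op["unit_count"])
                    return self.release(timestamp, pool_id, op["job_id"])
            src_id, dst_id, job_id = op["from_pool_id"], op["to_pool_id"], op["job_id"]
            if src_id == dst_id:
                return False  # asyncio.Lock is not reentrant; locking the same pool twice would deadlock
            first, second = sorted([src_id, dst_id])
            async with self.locks[first]:
                async with self.locks[second]:
                    source = self.pools.get(src_id)
                    target = self.pools.get(dst_id)
                    if source is None or target is None:
                        return False
                    claim = source["jobs"].get(job_id)
                    if claim is None or job_id in target["jobs"]:
                        return False  # nothing to move, or the move would overwrite an existing claim
                    used = sum(job["units"] for job in target["jobs"].values())
                    if used + claim["units"] > target["max_units"]:
                        return False  # assumed rule: a move must respect the target's capacity, like claim_unit
                    target["jobs"][job_id] = source["jobs"].pop(job_id)
                    target["history"].append(job_id)
                    return True

        return list(await asyncio.gather(*(execute_op(op) for op in ops)))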
self.pools = {
    "pool1": {"jobs": {"job1": {"units": 3}}, "history": ["job1"]},
    "pool2": {"jobs": {"job9": {"units": 2}}, "history": ["job9"]},
}
self.locks = {
    "pool1": <asyncio.Lock>,
    "pool2": <asyncio.Lock>,
}

Pattern A:
claim / release touch one pool, so they lock a single pool_id
Pattern B:
a claim move touches the source and target pools at once, so it takes a sorted pair-lock
Finally, gather all the results
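
Pattern B can be sketched in isolation. The example below is a toy, not the handbook's class: the `balances` dict, the `move` coroutine, and the same-key guard are assumptions for illustration. It shows two opposite-direction moves taking their locks in the same sorted order, which is what rules out the lock-order deadlock.

```python
import asyncio
from collections import defaultdict


async def main():
    locks = defaultdict(asyncio.Lock)        # one lock per pool id, created on first use
    balances = {"pool1": 5, "pool2": 5}      # hypothetical units held by each pool

    async def move(src, dst, units):
        if src == dst:
            return False                     # the same lock twice would deadlock: Lock is not reentrant
        first, second = sorted([src, dst])   # every task acquires in the same global order
        async with locks[first]:
            async with locks[second]:
                await asyncio.sleep(0)       # yield inside the critical section to force interleaving
                if balances[src] < units:
                    return False
                balances[src] -= units
                balances[dst] += units
                return True

    results = await asyncio.gather(
        move("pool1", "pool2", 3),
        move("pool2", "pool1", 1),           # opposite direction, same lock order
        move("pool1", "pool1", 1),           # rejected by the guard
    )
    return list(results), balances


results, balances = asyncio.run(main())
print(results, balances)  # [True, True, False] {'pool1': 3, 'pool2': 7}
```

Without `sorted()`, the first task could hold pool1 while waiting for pool2 and the second could hold pool2 while waiting for pool1.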
    async def sync_pools(self, timestamp, pool_ids, max_concurrent):
        self._purge_expired_claims(timestamp)
        sem = asyncio.Semaphore(max_concurrent)

        async def do_one(pool_id):
            if pool_id not in self.pools:
                return False
            async with sem:
                await asyncio.sleep(0.01)
                self.pools[pool_id]["history"].append(f"sync@{timestamp}")
                return True

        return list(await asyncio.gather(*(do_one(pool_id) for pool_id in pool_ids)))
Persistent state:
self.pools = {
    "pool1": {"history": ["job1", "sync@9000"]},
    "pool2": {"history": ["job9"]},
}
Transient runtime:
sem = asyncio.Semaphore(max_concurrent)

L6 adds no new self.xxx field
sync_pools() only gains a rate-limiting gate internally
Step 1: fail fast on a missing pool first
Step 2: only valid pools go through sem + sleep
Step 3: record the sync side effect after success
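
The fail-fast flow can be checked with a small standalone run that also measures how many calls are in flight at once. The names `sync_all`, `known`, and the `peak` counter are assumptions added for the example; the 0.01 s sleep stands in for the simulated API call.

```python
import asyncio


async def sync_all(known, ids, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def do_one(item_id):
        nonlocal in_flight, peak
        if item_id not in known:
            return False                 # fail-fast: checked before the semaphore, so no slot and no sleep
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)  # track the highest number of simultaneous calls
            await asyncio.sleep(0.01)    # simulated rate-limited call
            in_flight -= 1
        return True

    results = await asyncio.gather(*(do_one(item_id) for item_id in ids))
    return list(results), peak


results, peak = asyncio.run(sync_all({"a", "b", "c"}, ["a", "x", "b", "c"], 2))
print(results, peak)  # [True, False, True, True] 2
```

`gather` keeps results in input order, so the missing id shows up as `False` in its original position.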
InitFamily Problem D. Family 2: one zone holds many tenant reservations. On top of the usual Family 2 shape, this problem adds a merged_zones add-on.
import copy
import asyncio
from collections import defaultdict


class CapacityReservationBroker:
    def __init__(self):
        self.zones = {}
        self.backups = []
        self.merged_zones = {}
        self.locks = defaultdict(asyncio.Lock)
self.zones = {
    "z1": {
        "total_slots": 10,
        "tenants": {
            "t1": {"slots": 3, "reserved_at": 100, "expiry": 8000},
            "t2": {"slots": 5, "reserved_at": 200, "expiry": None},
        },
        "history": ["t1", "t2", "t3"],
    },
}
self.merged_zones = {"old_zone": "new_zone"}

zone_id is the outer key
tenant_id is the inner key
merge records do not live inside the zone
they are kept separately in self.merged_zones
    def create_zone(self, timestamp, zone_id, total_slots):
        if zone_id in self.zones:
            return False
        self.zones[zone_id] = {"total_slots": total_slots, "tenants": {}, "history": []}
        return True

    def reserve(self, timestamp, zone_id, tenant_id, slot_count):
        if zone_id not in self.zones:
            return False
        zone = self.zones[zone_id]
        if tenant_id in zone["tenants"]:
            return False
        used = sum(item["slots"] for item in zone["tenants"].values())
        if used + slot_count > zone["total_slots"]:
            return False
        zone["tenants"][tenant_id] = {"slots": slot_count, "reserved_at": timestamp, "expiry": None}
        zone["history"].append(tenant_id)
        return True

    def release(self, timestamp, zone_id, tenant_id):
        if zone_id not in self.zones:
            return False
        return self.zones[zone_id]["tenants"].pop(tenant_id, None) is not None

    def get_remaining(self, timestamp, zone_id):
        if zone_id not in self.zones:
            return None
        zone = self.zones[zone_id]
        used = sum(item["slots"] for item in zone["tenants"].values())
        return zone["total_slots"] - used
self.zones = {
    "z1": {
        "total_slots": 10,
        "tenants": {
            "t1": {"slots": 3, "reserved_at": 100},
        },
    },
}

Step 1: the outer level creates the zone
Step 2: inner tenant reservations go into tenants
Step 3: remaining is total - used
    def get_tenant_zones(self, timestamp, tenant_id):
        result = []
        for zone_id, zone in self.zones.items():
            if tenant_id in zone["tenants"]:
                result.append(zone_id)
        return sorted(result)

    def get_busiest_zones(self, timestamp, n):
        rows = []
        for zone_id, zone in self.zones.items():
            used = sum(item["slots"] for item in zone["tenants"].values())
            rows.append((used, zone_id))
        rows.sort(key=lambda x: (-x[0], x[1]))
        return [zone_id for _, zone_id in rows[:n]]
self.zones = {
    "z1": {
        "total_slots": 10,
        "tenants": {
            "alice": {"slots": 3},
            "bob": {"slots": 5},
        },
    },
    "z2": {
        "total_slots": 8,
        "tenants": {
            "alice": {"slots": 2},
        },
    },
}

get_tenant_zones("alice") → has to scan z1 and z2
get_busiest_zones → counts the slots already used in each zone
tenant_id is not an outer key
so get_tenant_zones loops over all zones
busiest_zones = aggregate, then rank
    def _purge_expired_reservations(self, timestamp):
        for zone in self.zones.values():
            dead = []
            for tenant_id, item in zone["tenants"].items():
                if item["expiry"] is not None and timestamp >= item["expiry"]:
                    dead.append(tenant_id)
            for tenant_id in dead:
                del zone["tenants"][tenant_id]

    def reserve_with_ttl(self, timestamp, zone_id, tenant_id, slot_count, ttl_ms):
        self._purge_expired_reservations(timestamp)
        if not self.reserve(timestamp, zone_id, tenant_id, slot_count):
            return False
        self.zones[zone_id]["tenants"][tenant_id]["expiry"] = timestamp + ttl_ms
        return True

    def extend(self, timestamp, zone_id, tenant_id, extra_ms):
        self._purge_expired_reservations(timestamp)
        tenant = self.zones.get(zone_id, {}).get("tenants", {}).get(tenant_id)
        if tenant is None or tenant["expiry"] is None:
            return False
        tenant["expiry"] += extra_ms
        return True
self.zones = {
    "z1": {
        "tenants": {
            "alice": {"slots": 3, "reserved_at": 100, "expiry": 8100},
            "bob": {"slots": 5, "reserved_at": 200, "expiry": None},
        },
    },
}

extend() changes exactly this field:
self.zones[zone_id]["tenants"][tenant_id]["expiry"]
Step 1: TTL still travels with the sub-item
Step 2: set expiry only after reserve succeeds
Step 3: extend adjusts the existing expiry; it does not recompute from timestamp
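
Step 3 is the part people tend to get wrong, so here is a tiny worked case. It is a sketch on a bare dict rather than the class above; the helper name `extend_expiry` and the call time 7000 are assumptions for the example.

```python
def extend_expiry(item, extra_ms):
    # A reservation without a TTL cannot be extended.
    if item["expiry"] is None:
        return False
    item["expiry"] += extra_ms  # add to the existing expiry, independent of the current timestamp
    return True


alice = {"slots": 3, "expiry": 8100}
bob = {"slots": 5, "expiry": None}

# Imagine this call happens at timestamp 7000 with extra_ms = 500:
# the new expiry is 8100 + 500 = 8600, not 7000 + 500 = 7500.
extended = extend_expiry(alice, 500)
rejected = extend_expiry(bob, 500)
```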
    def backup(self, timestamp):
        self._purge_expired_reservations(timestamp)
        snapshot = {}
        for zone_id, zone in self.zones.items():
            snapshot[zone_id] = {"total_slots": zone["total_slots"], "tenants": {}, "history": list(zone["history"])}
            for tenant_id, item in zone["tenants"].items():
                remaining_ttl = None if item["expiry"] is None else item["expiry"] - timestamp
                snapshot[zone_id]["tenants"][tenant_id] = {
                    "slots": item["slots"],
                    "reserved_at": item["reserved_at"],
                    "remaining_ttl": remaining_ttl,
                }
        self.backups.append((timestamp, snapshot))
        return str(len(snapshot))

    def restore(self, timestamp, restore_timestamp):
        self._purge_expired_reservations(timestamp)
        candidate = None
        for ts, snapshot in self.backups:
            if ts <= restore_timestamp:
                candidate = snapshot
        if candidate is None:
            return False
        self.zones = copy.deepcopy(candidate)
        for zone in self.zones.values():
            for tenant in zone["tenants"].values():
                remaining_ttl = tenant.pop("remaining_ttl")
                tenant["expiry"] = None if remaining_ttl is None else timestamp + remaining_ttl
        return True

    def get_reservation_history(self, timestamp, zone_id):
        self._purge_expired_reservations(timestamp)
        if zone_id not in self.zones:
            return None
        return list(self.zones[zone_id]["history"])
    def merge_zone(self, timestamp, from_zone_id, to_zone_id):
        self._purge_expired_reservations(timestamp)
        if from_zone_id not in self.zones or to_zone_id not in self.zones:
            return False
        if from_zone_id == to_zone_id:
            return False  # merging a zone into itself would double total_slots and then delete the zone
        source = self.zones[from_zone_id]
        target = self.zones[to_zone_id]
        for tenant_id, item in source["tenants"].items():
            target["tenants"][tenant_id] = item  # note: a tenant present in both zones is overwritten
            target["history"].append(tenant_id)
        target["total_slots"] += source["total_slots"]
        del self.zones[from_zone_id]
        self.merged_zones[from_zone_id] = to_zone_id
        return True
    def __init__(self):
        self.zones = {}
        self.backups = []
        self.merged_zones = {}

self.zones = {
    "z1": {
        "total_slots": 10,
        "tenants": {
            "alice": {"slots": 3, "reserved_at": 100, "expiry": 6200},
            "bob": {"slots": 5, "reserved_at": 200, "expiry": None},
        },
        "history": ["alice", "bob", "cathy"],
    },
    "z2": {
        "total_slots": 8,
        "tenants": {},
        "history": [],
    },
}
self.backups = [
    (5000, {
        "z1": {
            "total_slots": 10,
            "tenants": {
                "alice": {"slots": 3, "reserved_at": 100, "remaining_ttl": 1200},
                "bob": {"slots": 5, "reserved_at": 200, "remaining_ttl": None},
            },
            "history": ["alice", "bob", "cathy"],
        },
        "z2": {
            "total_slots": 8,
            "tenants": {},
            "history": [],
        },
    }),
]
self.merged_zones = {"old_zone": "new_zone"}

Step 1: history stays inside the zone
Step 2: merge records go into a separate merged_zones
Step 3: merge really does modify the data of both zones
    async def batch_reservations(self, timestamp, ops):
        self._purge_expired_reservations(timestamp)

        async def execute_op(op):
            if op["type"] in {"reserve", "release"}:
                zone_id = op["zone_id"]
                async with self.locks[zone_id]:
                    if op["type"] == "reserve":
                        return self.reserve(timestamp, zone_id, op["tenant_id"], op["slot_count"])
                    return self.release(timestamp, zone_id, op["tenant_id"])
            if op["from_zone_id"] == op["to_zone_id"]:
                return False  # asyncio.Lock is not reentrant; locking the same zone twice would deadlock
            first, second = sorted([op["from_zone_id"], op["to_zone_id"]])
            async with self.locks[first]:
                async with self.locks[second]:
                    return self.merge_zone(timestamp, op["from_zone_id"], op["to_zone_id"])

        return list(await asyncio.gather(*(execute_op(op) for op in ops)))
self.zones = {
    "z1": {"tenants": {"alice": {"slots": 3}}, "history": ["alice"]},
    "z2": {"tenants": {"bob": {"slots": 2}}, "history": ["bob"]},
}
self.locks = {
    "z1": <asyncio.Lock>,
    "z2": <asyncio.Lock>,
}
self.merged_zones = {"old_zone": "z1"}

reserve / release = Pattern A: they lock a single zone
merge_zone = Pattern B: it touches two zones at once, so it holds two zone locks
call sorted() before taking the pair-lock
    async def sync_zones(self, timestamp, zone_ids, max_concurrent):
        self._purge_expired_reservations(timestamp)
        sem = asyncio.Semaphore(max_concurrent)

        async def do_one(zone_id):
            if zone_id not in self.zones:
                return False
            async with sem:
                await asyncio.sleep(0.01)
                self.zones[zone_id]["history"].append(f"sync@{timestamp}")
                return True

        return list(await asyncio.gather(*(do_one(zone_id) for zone_id in zone_ids)))
Persistent state:
self.zones = {
    "z1": {"history": ["alice", "sync@9000"]},
    "z2": {"history": ["bob"]},
}
self.locks = defaultdict(asyncio.Lock)
Transient runtime:
sem = asyncio.Semaphore(max_concurrent)

What L6 adds is not state
but the sync_zones() flow
missing zone → fail-fast
valid zone → sem + sleep
record history after success
InitFamily Problem E. Family 2: one recipe holds many ingredients. The outer level is the recipe, the inner level is the ingredient.
import copy
import asyncio
from collections import defaultdict


class RecipeManager:
    def __init__(self):
        self.recipes = {}
        self.backups = []
        self.locks = defaultdict(asyncio.Lock)
self.recipes = {
    "r1": {
        "chef_id": "chef1",
        "cook_time": 30,
        "ingredients": {
            "flour": {"qty": 200, "expiry": None},
            "milk": {"qty": 100, "expiry": 5000},
        },
        "history": ["flour", "sugar", "milk"],
    },
}

recipe_id is the outer key
ingredient_name is the inner key
chef_id is recipe metadata
it does not need a second dict
    def create_recipe(self, timestamp, recipe_id, chef_id, cook_time):
        if recipe_id in self.recipes:
            return False
        self.recipes[recipe_id] = {
            "chef_id": chef_id,
            "cook_time": cook_time,
            "ingredients": {},
            "history": [],
        }
        return True

    def add_ingredient(self, timestamp, recipe_id, ingredient_name, qty):
        if recipe_id not in self.recipes:
            return False
        recipe = self.recipes[recipe_id]
        if ingredient_name in recipe["ingredients"]:
            return False
        recipe["ingredients"][ingredient_name] = {"qty": qty, "expiry": None}
        recipe["history"].append(ingredient_name)
        return True

    def remove_ingredient(self, timestamp, recipe_id, ingredient_name):
        if recipe_id not in self.recipes:
            return False
        return self.recipes[recipe_id]["ingredients"].pop(ingredient_name, None) is not None

    def get_recipe(self, timestamp, recipe_id):
        if recipe_id not in self.recipes:
            return None
        return copy.deepcopy(self.recipes[recipe_id])
self.recipes = {
    "r1": {
        "chef_id": "chef1",
        "cook_time": 30,
        "ingredients": {
            "flour": {"qty": 200, "expiry": None},
        },
        "history": ["flour"],
    },
}

Step 1: create_recipe builds the outer level
Step 2: add_ingredient adds the inner sub-item
Step 3: history belongs to the recipe, so it naturally lives inside the recipe
    def search_recipes_by_ingredient(self, timestamp, ingredient_name):
        result = []
        for recipe_id, recipe in self.recipes.items():
            if ingredient_name in recipe["ingredients"]:
                result.append(recipe_id)
        return sorted(result)

    def top_recipes_by_ingredient_count(self, timestamp, n):
        rows = []
        for recipe_id, recipe in self.recipes.items():
            rows.append((len(recipe["ingredients"]), recipe_id))
        rows.sort(key=lambda x: (-x[0], x[1]))
        return [recipe_id for _, recipe_id in rows[:n]]

    def get_chef_recipes(self, timestamp, chef_id):
        return sorted([recipe_id for recipe_id, recipe in self.recipes.items() if recipe["chef_id"] == chef_id])
self.recipes = {
    "cake": {
        "chef_id": "chef1",
        "ingredients": {"flour": {"qty": 200}, "milk": {"qty": 100}},
    },
    "soup": {
        "chef_id": "chef2",
        "ingredients": {"salt": {"qty": 5}},
    },
}

ingredient_name / chef_id are not outer keys
so both the search and the filter loop over all recipes
count the ingredients before ranking
    def _purge_expired_ingredients(self, timestamp):
        for recipe in self.recipes.values():
            dead = []
            for ingredient_name, ingredient in recipe["ingredients"].items():
                if ingredient["expiry"] is not None and timestamp >= ingredient["expiry"]:
                    dead.append(ingredient_name)
            for ingredient_name in dead:
                del recipe["ingredients"][ingredient_name]  # expired means the ingredient is used up / spoiled

    def add_ingredient_with_ttl(self, timestamp, recipe_id, ingredient_name, qty, ttl_ms):
        self._purge_expired_ingredients(timestamp)
        if not self.add_ingredient(timestamp, recipe_id, ingredient_name, qty):
            return False
        self.recipes[recipe_id]["ingredients"][ingredient_name]["expiry"] = timestamp + ttl_ms
        return True
self.recipes = {
    "cake": {
        "ingredients": {
            "flour": {"qty": 200, "expiry": None},
            "milk": {"qty": 100, "expiry": 5000},
        },
    },
}

TTL attaches only to the ingredient
expiry is an ingredient field, not a recipe field; the recipe metadata gets no expiry
once expired, the ingredient is deleted from the ingredients dict
    def get_recipe_history(self, timestamp, recipe_id):
        self._purge_expired_ingredients(timestamp)
        if recipe_id not in self.recipes:
            return None
        return list(self.recipes[recipe_id]["history"])

    def backup(self, timestamp):
        self._purge_expired_ingredients(timestamp)
        snapshot = {}
        for recipe_id, recipe in self.recipes.items():
            snapshot[recipe_id] = {
                "chef_id": recipe["chef_id"],
                "cook_time": recipe["cook_time"],
                "ingredients": {},
                "history": list(recipe["history"]),
            }
            for ingredient_name, ingredient in recipe["ingredients"].items():
                remaining_ttl = None if ingredient["expiry"] is None else ingredient["expiry"] - timestamp
                snapshot[recipe_id]["ingredients"][ingredient_name] = {
                    "qty": ingredient["qty"],
                    "remaining_ttl": remaining_ttl,
                }
        self.backups.append((timestamp, snapshot))
        return str(len(snapshot))

    def restore(self, timestamp, restore_timestamp):
        self._purge_expired_ingredients(timestamp)
        candidate = None
        for ts, snapshot in self.backups:
            if ts <= restore_timestamp:
                candidate = snapshot
        if candidate is None:
            return False
        self.recipes = copy.deepcopy(candidate)
        for recipe in self.recipes.values():
            for ingredient in recipe["ingredients"].values():
                remaining_ttl = ingredient.pop("remaining_ttl")
                ingredient["expiry"] = None if remaining_ttl is None else timestamp + remaining_ttl
        return True
self.backups = [
    (5000, {
        "cake": {
            "chef_id": "chef1",
            "cook_time": 30,
            "ingredients": {
                "milk": {"qty": 100, "remaining_ttl": 1200},
            },
            "history": ["flour", "milk"],
        },
    }),
]
self.recipes["cake"]["history"] = ["flour", "milk"]

backup works like every other Family 2 problem:
the TTL field is converted to remaining_ttl
history keeps travelling with the recipe
    async def batch_ingredients(self, timestamp, ops):
        self._purge_expired_ingredients(timestamp)

        async def execute_op(op):
            recipe_id = op["recipe_id"]
            async with self.locks[recipe_id]:
                if op["type"] == "add":
                    return self.add_ingredient(timestamp, recipe_id, op["ingredient_name"], op["qty"])
                if op["type"] == "remove":
                    return self.remove_ingredient(timestamp, recipe_id, op["ingredient_name"])
                return False

        return list(await asyncio.gather(*(execute_op(op) for op in ops)))
self.recipes = {
    "cake": {"ingredients": {"flour": {"qty": 200}}},
    "soup": {"ingredients": {"salt": {"qty": 5}}},
}
self.locks = {
    "cake": <asyncio.Lock>,
    "soup": <asyncio.Lock>,
}

This practice problem uses Pattern A only:
every op locks a single recipe_id
It has no natural two-recipe transfer,
so at minimum you need to know the single-recipe lock version.
    async def sync_recipes(self, timestamp, recipe_ids, max_concurrent):
        self._purge_expired_ingredients(timestamp)
        sem = asyncio.Semaphore(max_concurrent)

        async def do_one(recipe_id):
            if recipe_id not in self.recipes:
                return False
            async with sem:
                await asyncio.sleep(0.01)
                self.recipes[recipe_id]["history"].append(f"sync@{timestamp}")
                return True

        return list(await asyncio.gather(*(do_one(recipe_id) for recipe_id in recipe_ids)))
Persistent state:
self.recipes = {
    "cake": {"history": ["flour", "milk", "sync@9000"]},
    "soup": {"history": ["salt"]},
}
self.locks = defaultdict(asyncio.Lock)
Transient runtime:
sem = asyncio.Semaphore(max_concurrent)

L6 adds no new dict
sync_recipes() only gains rate limiting
missing recipe → fail-fast
valid recipe → sem + sleep
record the sync in history after success
InitFamily Problem F. Family 3: policies and violations are two separate dicts. The key point is that a violation_id can be operated on by itself, so B must be flat.
import copy
import asyncio
from collections import defaultdict


class ComplianceAuditEngine:
    def __init__(self):
        self.policies = {}
        self.violations = {}
        self.backups = []
        self.merged_entities = {}
        self.history = defaultdict(list)
        self.locks = defaultdict(asyncio.Lock)
self.policies = {
    "p1": {"description": "late filing", "max_violations": 3},
}
self.violations = {
    "v1": {
        "policy_id": "p1",
        "entity_id": "e1",
        "severity": 4,
        "expiry": 8000,
        "status": "ACTIVE",
    },
}
self.history = {"e1": ["v1", "v2", "v3"]}
self.merged_entities = {"old_entity": "new_entity"}

register_policy() populates policies
flag_violation() populates violations
clear_violation(violation_id) carries only the violation id
→ the B dict must be flat
    def register_policy(self, timestamp, policy_id, description, max_violations):
        if policy_id in self.policies:
            return False
        self.policies[policy_id] = {
            "description": description,
            "max_violations": max_violations,
        }
        return True

    def flag_violation(self, timestamp, policy_id, entity_id, violation_id, severity):
        if policy_id not in self.policies or violation_id in self.violations:
            return False
        self.violations[violation_id] = {
            "policy_id": policy_id,
            "entity_id": entity_id,
            "severity": severity,
            "expiry": None,
            "status": "ACTIVE",
        }
        self.history[entity_id].append(violation_id)  # one more violation recorded under this entity
        return True

    def clear_violation(self, timestamp, violation_id):
        violation = self.violations.get(violation_id)
        if violation is None or violation["status"] != "ACTIVE":
            return False
        violation["status"] = "CLEARED"
        return True

    def get_active_violations(self, timestamp, entity_id):
        return sorted([
            violation_id
            for violation_id, violation in self.violations.items()
            if violation["entity_id"] == entity_id and violation["status"] == "ACTIVE"
        ])
self.policies = {
    "p1": {"description": "...", "max_violations": 3},
}
self.violations = {
    "v1": {"policy_id": "p1", "entity_id": "e1", "severity": 4, "status": "ACTIVE"},
}

Step 1: A and B are two separate dicts
Step 2: flag_violation writes into the flat violations dict
Step 3: the per-entity history gets its own defaultdict(list)
    def find_violation(self, timestamp, violation_id):
        violation = self.violations.get(violation_id)
        if violation is None:
            return None
        return dict(violation)

    def get_worst_entities(self, timestamp, n):
        counts = defaultdict(int)
        for violation in self.violations.values():  # entity is not an outer key, so aggregate by walking violations
            if violation["status"] == "ACTIVE":
                counts[violation["entity_id"]] += 1
        rows = [(count, entity_id) for entity_id, count in counts.items()]
        rows.sort(key=lambda x: (-x[0], x[1]))
        return [entity_id for _, entity_id in rows[:n]]
self.violations = {
    "v1": {"entity_id": "alice", "severity": 4, "status": "ACTIVE"},
    "v2": {"entity_id": "alice", "severity": 2, "status": "CLEARED"},
    "v3": {"entity_id": "bob", "severity": 5, "status": "ACTIVE"},
}

find_violation("v1") → a direct lookup of self.violations["v1"]
get_worst_entities() → scans violations and aggregates by entity_id
violation_id can be fetched directly
entity_id has to be aggregated with a for loop over violations
This is the A/B division of labour you see again and again in Family 3
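
The A/B split can be seen on a bare dict. The sketch below is not the handbook's class: it uses module-level helpers (`clear_violation`, `worst_entities`) and a made-up fourth row `v4` purely to show that an id-only operation is a direct lookup while a per-entity ranking is an aggregate scan.

```python
from collections import defaultdict

# Flat B dict keyed by violation_id (sample rows are illustrative).
violations = {
    "v1": {"entity_id": "alice", "status": "ACTIVE"},
    "v2": {"entity_id": "alice", "status": "CLEARED"},
    "v3": {"entity_id": "bob", "status": "ACTIVE"},
    "v4": {"entity_id": "bob", "status": "ACTIVE"},
}


def clear_violation(violations, violation_id):
    # The caller only knows the violation id, so the dict must be keyed by it.
    row = violations.get(violation_id)
    if row is None or row["status"] != "ACTIVE":
        return False
    row["status"] = "CLEARED"
    return True


def worst_entities(violations, n):
    # entity_id is a field, not a key, so ranking entities means scanning every row.
    counts = defaultdict(int)
    for row in violations.values():
        if row["status"] == "ACTIVE":
            counts[row["entity_id"]] += 1
    rows = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [entity_id for entity_id, _ in rows[:n]]


before = worst_entities(violations, 2)   # bob has 2 active, alice has 1
cleared = clear_violation(violations, "v3")
after = worst_entities(violations, 2)    # now tied at 1 each, so the id breaks the tie
```

Had violations been nested under entity_id, `clear_violation` would first need to search every entity to find the row.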
    def _purge_expired_violations(self, timestamp):
        for violation in self.violations.values():
            if violation["expiry"] is not None and timestamp >= violation["expiry"] and violation["status"] == "ACTIVE":
                violation["status"] = "EXPIRED"  # not necessarily del here; flipping status keeps the record for history

    def flag_violation_with_ttl(self, timestamp, policy_id, entity_id, violation_id, severity, ttl_ms):
        self._purge_expired_violations(timestamp)
        if not self.flag_violation(timestamp, policy_id, entity_id, violation_id, severity):
            return False
        self.violations[violation_id]["expiry"] = timestamp + ttl_ms
        return True
self.violations = {
    "v1": {
        "policy_id": "p1",
        "entity_id": "e1",
        "severity": 4,
        "expiry": 8100,
        "status": "ACTIVE",
    },
    "v2": {
        "policy_id": "p1",
        "entity_id": "e1",
        "severity": 2,
        "expiry": 5000,
        "status": "EXPIRED",
    },
}

Step 1: TTL lives inside the B dict, on each violation
Step 2: purge at the start of each public method
Step 3: on expiry the status usually changes; the row is not necessarily deleted
    def backup(self, timestamp):  # snapshot policies + violations + history + merge map
        snapshot = {  # build one complete snapshot
            "policies": copy.deepcopy(self.policies),  # A dict: plain deepcopy
            "violations": {},  # B dict: TTL is converted row by row
            "history": copy.deepcopy(dict(self.history)),  # convert the defaultdict to a plain dict first
            "merged_entities": copy.deepcopy(self.merged_entities),  # the merge map is backed up too
        }
        for violation_id, violation in self.violations.items():  # build the snapshot version of each violation
            item = dict(violation)  # shallow copy first
            if item["expiry"] is not None:  # only rows with a TTL get a real remaining_ttl
                item["remaining_ttl"] = item["expiry"] - timestamp
            else:
                item["remaining_ttl"] = None  # permanent rows record None
            item.pop("expiry", None)  # the snapshot keeps no absolute time
            snapshot["violations"][violation_id] = item  # store it in the snapshot
        self.backups.append((timestamp, snapshot))  # append (ts, snapshot) to the backups list
        return str(len(snapshot["violations"]))  # number of violations captured this time
    def restore(self, timestamp, restore_timestamp):  # restore the latest snapshot at or before restore_timestamp
        candidate = None  # no snapshot chosen yet
        for ts, snapshot in self.backups:  # check each backup in order
            if ts <= restore_timestamp:  # only eligible snapshots can be candidates
                candidate = snapshot  # later entries are newer, so simply overwrite
        if candidate is None:
            return False  # no eligible snapshot → restore fails
        self.policies = copy.deepcopy(candidate["policies"])  # A dict: copy straight back
        self.violations = {}  # B dict: rebuild row by row
        for violation_id, item in candidate["violations"].items():  # remaining_ttl must become expiry again
            row = dict(item)  # copy the snapshot row first
            remaining_ttl = row.pop("remaining_ttl", None)  # read back the remaining TTL
            row["expiry"] = None if remaining_ttl is None else timestamp + remaining_ttl  # recompute expiry at restore time
            self.violations[violation_id] = row  # write back into the live violations
        self.history = defaultdict(list, copy.deepcopy(candidate["history"]))  # history is restored too
        self.merged_entities = copy.deepcopy(candidate["merged_entities"])  # and so is the merge map
        return True  # restore succeeded
    def get_violation_history(self, timestamp, entity_id):  # every violation this entity has ever received
        return list(self.history.get(entity_id, []))  # return a copy of the list

    def merge_entity(self, timestamp, from_entity_id, to_entity_id):  # fold the old entity into the new one
        if from_entity_id == to_entity_id:
            return False  # self-merge would duplicate the history and then delete it
        for violation in self.violations.values():  # scan violations and reassign the owner
            if violation["entity_id"] == from_entity_id:
                violation["entity_id"] = to_entity_id
        self.history[to_entity_id].extend(self.history.get(from_entity_id, []))  # move the old history to the new entity
        self.history.pop(from_entity_id, None)  # drop the old entity's history
        self.merged_entities[from_entity_id] = to_entity_id  # keep the merge path for later tracing
        return True
self.backups = [
    (5000, {
        "policies": {"p1": {"description": "late filing", "max_violations": 3}},
        "violations": {
            "v1": {"policy_id": "p1", "entity_id": "e1", "severity": 4, "status": "ACTIVE", "remaining_ttl": 1200},
        },
        "history": {"e1": ["v1", "v2"]},
        "merged_entities": {"old_entity": "new_entity"},
    }),
]
self.history = {
    "e1": ["v1", "v2"],
    "e2": ["v9"],
}
self.merged_entities = {"old_entity": "new_entity"}

Step 1: on backup, the B dict's TTL is converted to remaining_ttl
Step 2: on restore, remaining_ttl is converted back to expiry relative to the restore time
Step 3: history spans many violations, so it lives in self.history
Step 4: merge_entity changes the entity_id stored inside violations
    async def batch_audit(self, timestamp, ops):
        async def execute_op(op):
            if op["type"] in {"flag", "clear"}:
                key = op["violation_id"]  # a single-violation op locks on violation_id
                async with self.locks[key]:
                    if op["type"] == "flag":
                        return self.flag_violation(timestamp, op["policy_id"], op["entity_id"], op["violation_id"], op["severity"])
                    return self.clear_violation(timestamp, op["violation_id"])
            if op["from_entity_id"] == op["to_entity_id"]:
                return False  # asyncio.Lock is not reentrant; locking the same entity twice would deadlock
            first, second = sorted([op["from_entity_id"], op["to_entity_id"]])  # a move / transfer touches two entities at once
            async with self.locks[first]:
                async with self.locks[second]:
                    violation = self.violations.get(op["violation_id"])
                    if violation is None or violation["entity_id"] != op["from_entity_id"]:
                        return False
                    violation["entity_id"] = op["to_entity_id"]
                    self.history[op["to_entity_id"]].append(op["violation_id"])
                    return True

        return list(await asyncio.gather(*(execute_op(op) for op in ops)))
self.violations = {
    "v1": {"entity_id": "alice", "status": "ACTIVE"},
    "v2": {"entity_id": "bob", "status": "ACTIVE"},
}
self.history = {
    "alice": ["v1"],
    "bob": ["v2"],
}
self.locks = {
    "v1": <asyncio.Lock>,
    "alice": <asyncio.Lock>,
    "bob": <asyncio.Lock>,
}

flag / clear = Pattern A (lock on violation_id)
move violation between entities = Pattern B (sorted pair-lock on from_entity_id + to_entity_id)
    async def report_violations(self, timestamp, violation_ids, max_concurrent):
        sem = asyncio.Semaphore(max_concurrent)

        async def do_one(violation_id):
            violation = self.violations.get(violation_id)
            if violation is None or violation["status"] != "ACTIVE":
                return False
            async with sem:
                await asyncio.sleep(0.01)
                violation["status"] = "REPORTED"
                return True

        return list(await asyncio.gather(*(do_one(violation_id) for violation_id in violation_ids)))
Persistent state:
self.violations = {
    "v1": {"status": "ACTIVE"},
    "v2": {"status": "REPORTED"},
}
self.history = defaultdict(list)
self.locks = defaultdict(asyncio.Lock)
Transient runtime:
sem = asyncio.Semaphore(max_concurrent)

On success, report_violations changes only:
self.violations[violation_id]["status"] = "REPORTED"
missing / non-active violation → fail-fast
only valid ones go through sem + sleep
mark REPORTED after success
InitFamily Problem G. Family 3: queue is A, report is B. Because claim_report(report_id) carries only report_id, reports must be flat.
import copy
import asyncio
from collections import defaultdict


class ContentModerationPipeline:
    def __init__(self):
        self.queues = {}
        self.reports = {}
        self.backups = []
        self.history = defaultdict(list)
        self.locks = defaultdict(asyncio.Lock)
self.queues = {
    "q1": {"priority_level": 3},
}
self.reports = {
    "r1": {
        "queue_id": "q1",
        "content_id": "c1",
        "moderator_id": "mod1",
        "reason": "spam",
        "decision": None,
        "expiry": 8000,
        "status": "CLAIMED",
    },
}
self.history = {"c1": ["r1", "r2"]}

queue metadata goes in A
each report is stored independently in B
the per-content history gets its own defaultdict(list)
    def create_queue(self, timestamp, queue_id, priority_level):
        if queue_id in self.queues:
            return False
        self.queues[queue_id] = {"priority_level": priority_level}
        return True

    def submit_report(self, timestamp, queue_id, report_id, content_id, reason):
        if queue_id not in self.queues or report_id in self.reports:
            return False
        self.reports[report_id] = {
            "queue_id": queue_id,
            "content_id": content_id,
            "moderator_id": None,
            "reason": reason,
            "decision": None,
            "expiry": None,
            "status": "PENDING",
        }
        self.history[content_id].append(report_id)
        return True

    def claim_report(self, timestamp, report_id, moderator_id):
        report = self.reports.get(report_id)
        if report is None or report["status"] != "PENDING":
            return False
        report["moderator_id"] = moderator_id
        report["status"] = "CLAIMED"
        return True

    def resolve_report(self, timestamp, report_id, decision):
        report = self.reports.get(report_id)
        if report is None or report["status"] not in {"PENDING", "CLAIMED"}:
            return False
        report["decision"] = decision
        report["status"] = "RESOLVED"
        return True
self.queues = {
    "q1": {"priority_level": 3},
}
self.reports = {
    "r1": {
        "queue_id": "q1",
        "content_id": "c1",
        "moderator_id": None,
        "reason": "spam",
        "decision": None,
        "status": "PENDING",
    },
}
self.history = {"c1": ["r1"]}

Step 1: create_queue populates A
Step 2: submit_report populates the flat B
Step 3: claim / resolve both address report_id directly
→ so reports must be a flat dict
    def find_report(self, timestamp, report_id):
        report = self.reports.get(report_id)
        return None if report is None else dict(report)

    def get_moderator_workload(self, timestamp, moderator_id):
        return sum(1 for report in self.reports.values() if report["moderator_id"] == moderator_id and report["status"] == "CLAIMED")

    def get_busiest_queues(self, timestamp, n):
        counts = defaultdict(int)
        for report in self.reports.values():
            if report["status"] in {"PENDING", "CLAIMED"}:
                counts[report["queue_id"]] += 1
        rows = [(count, queue_id) for queue_id, count in counts.items()]
        rows.sort(key=lambda x: (-x[0], x[1]))
        return [queue_id for _, queue_id in rows[:n]]
self.reports = {
    "r1": {"queue_id": "q1", "moderator_id": "mod1", "status": "CLAIMED"},
    "r2": {"queue_id": "q1", "moderator_id": "mod1", "status": "CLAIMED"},
    "r3": {"queue_id": "q2", "moderator_id": None, "status": "PENDING"},
}

find_report("r1") → a direct lookup by report_id
moderator workload / busiest queues → scan reports and aggregate
    def _purge_expired_reports(self, timestamp):
        for report in self.reports.values():
            if report["expiry"] is not None and timestamp >= report["expiry"] and report["status"] == "PENDING":
                report["status"] = "EXPIRED"

    def submit_report_with_ttl(self, timestamp, queue_id, report_id, content_id, reason, ttl_ms):
        self._purge_expired_reports(timestamp)
        if not self.submit_report(timestamp, queue_id, report_id, content_id, reason):
            return False
        report = self.reports[report_id]
        report["submitted_at"] = timestamp  # assumed extra field: an age cannot be computed without a start time
        report["expiry"] = timestamp + ttl_ms
        return True

    def get_report_age(self, timestamp, report_id):
        self._purge_expired_reports(timestamp)
        report = self.reports.get(report_id)
        if report is None or report["status"] == "EXPIRED":
            return None
        submitted_at = report.get("submitted_at")  # only set by submit_report_with_ttl in this sketch
        return None if submitted_at is None else timestamp - submitted_at  # age = elapsed time, not the raw timestamp
self.reports = {
"r1": {
"queue_id": "q1",
"content_id": "c1",
"expiry": 8100,
"status": "PENDING",
},
"r2": {
"queue_id": "q2",
"content_id": "c9",
"expiry": 5000,
"status": "EXPIRED",
},
}
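The TTL lives on the report record.
On expiry the record is kept and its status becomes EXPIRED; it is usually not deleted on the spot.
Moderation problems like this one often need to preserve an audit trail.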
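The boundary rule is easy to get wrong, so here is a minimal sketch of it in isolation. `mark_expired` is a hypothetical standalone helper, not part of the class; it assumes a report is alive only while `now < expiry`, so the tick equal to `expiry` already counts as expired.

```python
def mark_expired(reports, now):
    """Flip overdue PENDING reports to EXPIRED without deleting them."""
    for report in reports.values():
        expiry = report.get("expiry")  # None or missing means "no TTL"
        if expiry is not None and report["status"] == "PENDING" and now >= expiry:
            report["status"] = "EXPIRED"

reports = {
    "r1": {"status": "PENDING", "expiry": 8100},
    "r2": {"status": "PENDING", "expiry": None},
    "r3": {"status": "CLAIMED", "expiry": 5000},
}
mark_expired(reports, 8099)
print(reports["r1"]["status"])  # PENDING: one tick before expiry
mark_expired(reports, 8100)
print(reports["r1"]["status"])  # EXPIRED: the expiry tick itself is already dead
```

Only PENDING reports are touched, so a CLAIMED report keeps its status even when its expiry has passed, matching the status check in `_purge_expired_reports`.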
def get_content_history(self, timestamp, content_id):  # reports filed against one piece of content
    return list(self.history.get(content_id, []))  # return a copy of the history list
def escalate(self, timestamp, report_id, from_queue_id, to_queue_id):  # move a report to another queue
    report = self.reports.get(report_id)  # look up the report first
    if report is None or report["queue_id"] != from_queue_id:
        return False  # missing, or not in the stated source queue
    report["queue_id"] = to_queue_id  # point it at the new queue
    report["status"] = "ESCALATED"  # mark it as escalated
    return True
def backup(self, timestamp):  # snapshot queues + reports + history
    snapshot = {"queues": copy.deepcopy(self.queues), "reports": {}, "history": copy.deepcopy(dict(self.history))}
    for report_id, report in self.reports.items():  # convert each absolute expiry into a remaining TTL
        item = dict(report)
        expiry = item.pop("expiry", None)  # tolerate reports that never had a TTL
        item["remaining_ttl"] = None if expiry is None else expiry - timestamp
        snapshot["reports"][report_id] = item
    self.backups.append((timestamp, snapshot))  # append to the backups list
    return str(len(snapshot["reports"]))  # number of reports in the snapshot
def restore(self, timestamp, restore_timestamp):  # restore from a stored backup
    candidate = None  # nothing chosen yet
    for ts, snapshot in self.backups:
        if ts <= restore_timestamp:
            candidate = snapshot  # latest backup taken at or before restore_timestamp
    if candidate is None:
        return False
    self.queues = copy.deepcopy(candidate["queues"])  # A dict: copy back as-is
    self.reports = {}  # B dict: rebuild record by record
    for report_id, item in candidate["reports"].items():
        row = dict(item)
        remaining_ttl = row.pop("remaining_ttl", None)  # TTL stored in the snapshot
        row["expiry"] = None if remaining_ttl is None else timestamp + remaining_ttl
        self.reports[report_id] = row
    self.history = defaultdict(list, copy.deepcopy(candidate["history"]))  # content history is restored too
    return True
self.backups = [
(5000, {
"queues": {"q1": {"priority_level": 3}},
"reports": {
"r1": {"queue_id": "q1", "content_id": "c1", "status": "CLAIMED", "remaining_ttl": 1200},
},
"history": {"c1": ["r1"]},
}),
]
self.history = {"c1": ["r1", "r2"]}
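Step 1: on backup, convert each report's expiry into remaining_ttl
Step 2: on restore, recompute expiry from the timestamp of the restore call
Step 3: content history spans multiple reports, so it lives in self.history[content_id]
Step 4: escalate just changes queue_id inside the report record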
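Steps 1 and 2 amount to a round trip between an absolute expiry and a relative TTL. The sketch below isolates that conversion; `to_snapshot` and `from_snapshot` are hypothetical helper names used only for illustration.

```python
def to_snapshot(record, now):
    """Backup side: replace the absolute expiry with the time still left."""
    item = dict(record)
    expiry = item.pop("expiry", None)
    item["remaining_ttl"] = None if expiry is None else expiry - now
    return item

def from_snapshot(item, now):
    """Restore side: re-anchor the remaining time to the restore timestamp."""
    row = dict(item)
    remaining = row.pop("remaining_ttl", None)
    row["expiry"] = None if remaining is None else now + remaining
    return row

record = {"status": "CLAIMED", "expiry": 6200}
snap = to_snapshot(record, now=5000)
print(snap)      # {'status': 'CLAIMED', 'remaining_ttl': 1200}
restored = from_snapshot(snap, now=9000)
print(restored)  # {'status': 'CLAIMED', 'expiry': 10200}
```

The record had 1200 ms left at backup time, so it gets 1200 ms again after the restore, regardless of how much wall-clock time passed in between.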
async def batch_moderate(self, timestamp, ops):
    async def execute_op(op):
        if op["type"] in {"submit", "claim", "resolve"}:
            key = op["report_id"]
            async with self.locks[key]:  # single-report op: lock the report_id
                if op["type"] == "submit":
                    return self.submit_report(timestamp, op["queue_id"], op["report_id"], op["content_id"], op["reason"])
                if op["type"] == "claim":
                    return self.claim_report(timestamp, op["report_id"], op["moderator_id"])
                return self.resolve_report(timestamp, op["report_id"], op["decision"])
        # escalate touches two queues: take both locks in sorted order so opposite moves cannot deadlock
        keys = sorted({op["from_queue_id"], op["to_queue_id"]})  # the set drops a repeated queue_id
        async with self.locks[keys[0]]:
            if len(keys) == 1:  # same queue on both sides: asyncio.Lock is not reentrant, so lock once
                return self.escalate(timestamp, op["report_id"], op["from_queue_id"], op["to_queue_id"])
            async with self.locks[keys[1]]:
                return self.escalate(timestamp, op["report_id"], op["from_queue_id"], op["to_queue_id"])
    return list(await asyncio.gather(*(execute_op(op) for op in ops)))
self.reports = {
"r1": {"queue_id": "q1", "status": "CLAIMED"},
"r2": {"queue_id": "q2", "status": "PENDING"},
}
self.locks = {
"r1": <asyncio.Lock>,
"q1": <asyncio.Lock>,
"q2": <asyncio.Lock>,
}
submit / claim / resolve → single-report op, lock the report_id
escalate touches from_queue_id and to_queue_id at the same time → lock both in sorted order (pair-lock)
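Why the sort matters is easiest to see with two moves in opposite directions. The following is a minimal sketch under assumptions: `demo`, `move`, and `queue_of` are hypothetical names, and the early return for `src == dst` is an added guard, since acquiring the same non-reentrant `asyncio.Lock` twice would hang.

```python
import asyncio
from collections import defaultdict

async def demo():
    locks = defaultdict(asyncio.Lock)  # one lock per queue_id, created lazily
    queue_of = {"r1": "q1", "r2": "q2"}

    async def move(report_id, src, dst):
        if src == dst:
            return False  # assumption: treat a same-queue move as invalid
        first, second = sorted([src, dst])  # every task locks in the same global order
        async with locks[first]:
            await asyncio.sleep(0)  # yield, so the other task gets a chance to interleave
            async with locks[second]:
                if queue_of.get(report_id) != src:
                    return False
                queue_of[report_id] = dst
                return True

    # q1->q2 and q2->q1 at once: unsorted locking could leave each task
    # holding one lock while waiting for the other
    results = await asyncio.gather(move("r1", "q1", "q2"), move("r2", "q2", "q1"))
    return list(results), queue_of

print(asyncio.run(demo()))  # ([True, True], {'r1': 'q2', 'r2': 'q1'})
```

Because both tasks ask for `q1` before `q2`, the second task simply waits for the first to finish instead of holding `q2` while the first waits for it.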
async def send_decisions(self, timestamp, report_ids, max_concurrent):
sem = asyncio.Semaphore(max_concurrent)
async def do_one(report_id):
report = self.reports.get(report_id)
if report is None or report["status"] != "RESOLVED":
return False
async with sem:
await asyncio.sleep(0.01)
report["status"] = "NOTIFIED"
return True
return list(await asyncio.gather(*(do_one(report_id) for report_id in report_ids)))
Persistent state:
self.reports = {
"r1": {"status": "RESOLVED"},
"r2": {"status": "NOTIFIED"},
}
self.history = defaultdict(list)
self.locks = defaultdict(asyncio.Lock)
Per-call runtime state:
sem = asyncio.Semaphore(max_concurrent)
On success, send_decisions changes:
report["status"] = "NOTIFIED"
Not RESOLVED → fail-fast, return False before touching the semaphore
RESOLVED → acquire sem, then sleep
On success, mark the report NOTIFIED
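To check that the fail-fast branch really bypasses the semaphore and that the limit holds, the pattern can be instrumented. This is a sketch only: `send_all` is a hypothetical standalone function, and the `in_flight` / `peak` counters are added purely for observation; they are not part of the original method.

```python
import asyncio

async def send_all(statuses, report_ids, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)
    in_flight = 0  # tasks currently inside the semaphore
    peak = 0       # highest in_flight value seen

    async def send_one(report_id):
        nonlocal in_flight, peak
        if statuses.get(report_id) != "RESOLVED":
            return False  # fail-fast: checked before the semaphore, so no slot and no sleep
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # simulated external call
            in_flight -= 1
            statuses[report_id] = "NOTIFIED"
            return True

    results = await asyncio.gather(*(send_one(r) for r in report_ids))
    return list(results), peak

statuses = {"r1": "RESOLVED", "r2": "PENDING", "r3": "RESOLVED", "r4": "RESOLVED"}
results, peak = asyncio.run(send_all(statuses, ["r1", "r2", "missing", "r3", "r4"], 2))
print(results)  # [True, False, False, True, True]
```

`asyncio.gather` returns results in argument order, so the output list lines up with `report_ids` even though the invalid ids finish first.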
InitFamily problem H. Family 3: band is A, lease is B. A lease_id can be revoked or transferred on its own, so leases needs its own flat dict.
import copy
import asyncio
from collections import defaultdict
class FrequencySpectrumAllocator:
def __init__(self):
self.bands = {}
self.leases = {}
self.backups = []
self.locks = defaultdict(asyncio.Lock)
self.bands = {
"b1": {"freq_start": 700, "freq_end": 710, "history": ["lease1", "lease2"]},
}
self.leases = {
"lease1": {
"band_id": "b1",
"operator_id": "op1",
"expiry": 8000,
"status": "ACTIVE",
},
}
Band metadata goes in A (self.bands)
Lease records go in B (self.leases)
Band history travels with the band itself
No separate third history dict is needed
def register_band(self, timestamp, band_id, freq_start, freq_end):
if band_id in self.bands:
return False
self.bands[band_id] = {"freq_start": freq_start, "freq_end": freq_end, "history": []}
return True
def lease(self, timestamp, band_id, operator_id, lease_id):
if band_id not in self.bands or lease_id in self.leases:
return False
self.leases[lease_id] = {
"band_id": band_id,
"operator_id": operator_id,
"expiry": None,
"status": "ACTIVE",
}
self.bands[band_id]["history"].append(lease_id)
return True
def revoke(self, timestamp, lease_id):
lease = self.leases.get(lease_id)
if lease is None or lease["status"] != "ACTIVE":
return False
lease["status"] = "REVOKED"
return True
def get_band_status(self, timestamp, band_id):
return sorted([
lease_id
for lease_id, lease in self.leases.items()
if lease["band_id"] == band_id and lease["status"] == "ACTIVE"
])
self.bands = {
"b1": {"freq_start": 700, "freq_end": 710, "history": ["lease1"]},
}
self.leases = {
"lease1": {
"band_id": "b1",
"operator_id": "op1",
"status": "ACTIVE",
},
}
Step 1: register_band builds A
Step 2: lease builds the flat B
Step 3: revoke addresses the lease directly by lease_id
→ so leases must be a flat dict
def get_operator_bands(self, timestamp, operator_id):
rows = []
for lease in self.leases.values():
if lease["operator_id"] == operator_id and lease["status"] == "ACTIVE":
rows.append(lease["band_id"])
return sorted(rows)
def get_available_bands(self, timestamp):
leased = {
lease["band_id"]
for lease in self.leases.values()
if lease["status"] == "ACTIVE"
}
return sorted([band_id for band_id in self.bands if band_id not in leased])
self.bands = {
"b1": {"freq_start": 700, "freq_end": 710},
"b2": {"freq_start": 720, "freq_end": 730},
}
self.leases = {
"lease1": {"band_id": "b1", "operator_id": "op1", "status": "ACTIVE"},
}
get_operator_bands("op1") → scan leases and match on operator_id
get_available_bands() → all bands minus the band_ids held by ACTIVE leases
operator_id is not an outer key,
so both queries filter over leases (B)
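The "bands minus active band_ids" step is a set difference. A minimal sketch, with `available_bands` as a hypothetical free function over the two dicts:

```python
def available_bands(bands, leases):
    """Band ids that no ACTIVE lease currently points at, sorted ascending."""
    in_use = {lease["band_id"] for lease in leases.values() if lease["status"] == "ACTIVE"}
    return sorted(set(bands) - in_use)

bands = {
    "b1": {"freq_start": 700, "freq_end": 710},
    "b2": {"freq_start": 720, "freq_end": 730},
    "b3": {"freq_start": 740, "freq_end": 750},
}
leases = {
    "lease1": {"band_id": "b1", "operator_id": "op1", "status": "ACTIVE"},
    "lease2": {"band_id": "b3", "operator_id": "op2", "status": "REVOKED"},
}
print(available_bands(bands, leases))  # ['b2', 'b3']
```

A revoked lease no longer blocks its band, which is why `b3` shows up as available.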
def _purge_expired_leases(self, timestamp):
for lease in self.leases.values():
if lease["expiry"] is not None and timestamp >= lease["expiry"] and lease["status"] == "ACTIVE":
lease["status"] = "EXPIRED"
def lease_with_ttl(self, timestamp, band_id, operator_id, lease_id, ttl_ms):
self._purge_expired_leases(timestamp)
if not self.lease(timestamp, band_id, operator_id, lease_id):
return False
self.leases[lease_id]["expiry"] = timestamp + ttl_ms
return True
def get_remaining_lease(self, timestamp, lease_id):
lease = self.leases.get(lease_id)
if lease is None or lease["expiry"] is None:
return None
return max(0, lease["expiry"] - timestamp)
self.leases = {
"lease1": {
"band_id": "b1",
"operator_id": "op1",
"expiry": 8100,
"status": "ACTIVE",
},
"lease2": {
"band_id": "b2",
"operator_id": "op9",
"expiry": 5000,
"status": "EXPIRED",
},
}
The TTL lives on the lease.
The band itself only stores metadata + history; band metadata never gets its own expiry.
After expiry the lease usually just gets status = EXPIRED.
def get_lease_history(self, timestamp, band_id):  # every lease ever issued on this band
    if band_id not in self.bands:
        return None  # unknown band
    return list(self.bands[band_id]["history"])  # return a copy of the history list
def transfer_lease(self, timestamp, lease_id, new_operator_id):  # hand a lease to another operator
    lease = self.leases.get(lease_id)  # look up the lease first
    if lease is None or lease["status"] != "ACTIVE":
        return False  # missing or no longer active, so it cannot be transferred
    lease["operator_id"] = new_operator_id  # a transfer only changes operator_id
    return True
def backup(self, timestamp):  # snapshot bands + leases
    snapshot = {"bands": copy.deepcopy(self.bands), "leases": {}}
    for lease_id, lease in self.leases.items():  # convert each lease expiry into remaining_ttl
        item = dict(lease)
        item["remaining_ttl"] = None if item["expiry"] is None else item["expiry"] - timestamp
        item.pop("expiry", None)
        snapshot["leases"][lease_id] = item
    self.backups.append((timestamp, snapshot))  # append to the backups list
    return str(len(snapshot["leases"]))  # number of leases in the snapshot
def restore(self, timestamp, restore_timestamp):  # restore from a stored snapshot
    candidate = None  # no backup chosen yet
    for ts, snapshot in self.backups:
        if ts <= restore_timestamp:
            candidate = snapshot  # latest backup taken at or before restore_timestamp
    if candidate is None:
        return False
    self.bands = copy.deepcopy(candidate["bands"])  # A dict: restore as-is
    self.leases = {}  # B dict: rebuild record by record
    for lease_id, item in candidate["leases"].items():
        row = dict(item)
        remaining_ttl = row.pop("remaining_ttl", None)  # TTL stored in the snapshot
        row["expiry"] = None if remaining_ttl is None else timestamp + remaining_ttl
        self.leases[lease_id] = row
    return True
self.backups = [
(5000, {
"bands": {"b1": {"freq_start": 700, "freq_end": 710, "history": ["lease1"]}},
"leases": {"lease1": {"band_id": "b1", "operator_id": "op1", "status": "ACTIVE", "remaining_ttl": 1200}},
}),
]
self.bands["b1"]["history"] = ["lease1", "lease2"]
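Step 1: on backup, expiry is converted into remaining_ttl here as well
Step 2: on restore, expiry is recomputed from remaining_ttl
Step 3: history travels with the band
Step 4: an operator transfer just changes operator_id inside the lease record (B)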
async def batch_ops(self, timestamp, ops):
    async def execute_op(op):
        if op["type"] in {"lease", "revoke"}:
            key = op["lease_id"]
            async with self.locks[key]:  # single-lease op: lock the lease_id
                if op["type"] == "lease":
                    return self.lease(timestamp, op["band_id"], op["operator_id"], op["lease_id"])
                return self.revoke(timestamp, op["lease_id"])
        src, dst = op["from_band_id"], op["to_band_id"]
        if src == dst or dst not in self.bands:
            return False  # same band would re-acquire one non-reentrant lock; an unknown band has no history list
        first, second = sorted([src, dst])  # sorted pair-lock: same order for every task
        async with self.locks[first]:
            async with self.locks[second]:
                lease = self.leases.get(op["lease_id"])
                if lease is None or lease["band_id"] != src:
                    return False
                lease["band_id"] = dst
                self.bands[dst]["history"].append(op["lease_id"])
                return True
    return list(await asyncio.gather(*(execute_op(op) for op in ops)))
self.leases = {
"lease1": {"band_id": "b1", "status": "ACTIVE"},
"lease2": {"band_id": "b2", "status": "REVOKED"},
}
self.locks = {
"lease1": <asyncio.Lock>,
"b1": <asyncio.Lock>,
"b2": <asyncio.Lock>,
}
lease / revoke → single-lease op, lock the lease_id
moving a lease to another band → lock from_band_id and to_band_id in sorted order (pair-lock)
async def sync_bands(self, timestamp, band_ids, max_concurrent):
sem = asyncio.Semaphore(max_concurrent)
async def do_one(band_id):
if band_id not in self.bands:
return False
async with sem:
await asyncio.sleep(0.01)
self.bands[band_id]["history"].append(f"sync@{timestamp}")
return True
return list(await asyncio.gather(*(do_one(band_id) for band_id in band_ids)))
Persistent state:
self.bands = {
"b1": {"history": ["lease1", "sync@9000"]},
"b2": {"history": ["lease9"]},
}
self.leases = {
"lease1": {"status": "ACTIVE"},
}
self.locks = defaultdict(asyncio.Lock)
Per-call runtime state:
sem = asyncio.Semaphore(max_concurrent)
Missing band → fail-fast
Valid band → acquire sem, then sleep
On success, append a sync event to the band's history
Practice mock. Family 3: cases and docs are two flat dicts. The focus is not only TTL but also the status machine DRAFT -> REVIEW -> APPROVED/REJECTED, and reject() rolls back every other approved document in the same case.
import asyncio
from collections import defaultdict
class DocumentApprovalWorkflow:
def __init__(self):
self.cases = {} # case_id -> {"client_name": str, "history": []}
self.docs = {} # doc_id -> {"case_id", "title", "reviewer_id", "status", "expiry"}
self.backups = [] # L4 backup list
self.locks = defaultdict(asyncio.Lock) # L5 per-doc lock
self.cases = {
"case1": {"client_name": "Acme", "history": ["100: docA added"]},
}
self.docs = {
"docA": {
"case_id": "case1",
"title": "MSA",
"reviewer_id": "rachel",
"status": "REVIEW",
"expiry": 5100,
},
}
self.backups = []
self.locks = {"docA": <asyncio.Lock>}
def create_case(self, timestamp, case_id, client_name): # 開新 case
self._purge(timestamp) # 開頭先清過期 doc
if case_id in self.cases: # case_id 已存在就唔開
return False
self.cases[case_id] = {"client_name": client_name, "history": []} # A dict 起 case
return True
def add_document(self, timestamp, case_id, doc_id, title, ttl_ms=None): # 加 doc 入 case
self._purge(timestamp)
if case_id not in self.cases or doc_id in self.docs:
return False # case 唔存在 / doc 重複都唔得
expiry = timestamp + ttl_ms if ttl_ms is not None else None # L3:有 ttl_ms 就即場計 expiry
self.docs[doc_id] = { # B dict flat 存 doc
"case_id": case_id,
"title": title,
"reviewer_id": None,
"status": "DRAFT",
"expiry": expiry,
}
self.cases[case_id]["history"].append(f"{timestamp}: {doc_id} added") # history 跟 case 走
return True
def assign_reviewer(self, timestamp, doc_id, reviewer_id): # 指派 reviewer
self._purge(timestamp)
if doc_id not in self.docs:
return None
self.docs[doc_id]["reviewer_id"] = reviewer_id # 直接改 B dict
return True
def submit_for_review(self, timestamp, doc_id): # DRAFT -> REVIEW
self._purge(timestamp)
if doc_id not in self.docs:
return None
if self.docs[doc_id]["status"] != "DRAFT":
return False
self._set_status(doc_id, "REVIEW", timestamp) # 同步記入 case history
return True
def approve(self, timestamp, doc_id): # REVIEW -> APPROVED
self._purge(timestamp)
if doc_id not in self.docs:
return None
if self.docs[doc_id]["status"] != "REVIEW":
return False
self._set_status(doc_id, "APPROVED", timestamp)
return True
def reject(self, timestamp, doc_id): # REVIEW -> REJECTED + rollback
self._purge(timestamp)
if doc_id not in self.docs:
return None
doc = self.docs[doc_id]
if doc["status"] != "REVIEW":
return False
self._set_status(doc_id, "REJECTED", timestamp) # 先 reject 自己
case_id = doc["case_id"] # 再搵返同一 case
for other_id, other in self.docs.items():
if other["case_id"] == case_id and other["status"] == "APPROVED":
self._set_status(other_id, "DRAFT", timestamp) # rollback:批過嘅通通打返草稿
return True
self.cases = {
"case1": {"client_name": "Acme", "history": ["100: docA added"]},
}
self.docs = {
"docA": {"case_id": "case1", "title": "MSA", "reviewer_id": "rachel", "status": "DRAFT", "expiry": None},
"docB": {"case_id": "case1", "title": "NDA", "reviewer_id": None, "status": "REVIEW", "expiry": None},
}
Step 1: the A dict opens the case; the B dict stores docs flat
Step 2: assign / submit / approve / reject all address the doc directly by doc_id
Step 3: the status machine is driven by doc["status"]
Step 4: the crux of reject is rolling back the other APPROVED docs in the same case
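The methods above call `self._set_status`, which is not shown in this section. The sketch below is one possible shape for it, written as free functions so it runs on its own. It is an assumption, not the original helper: the history line format is inferred from the sample entry "200: docA DRAFT->REVIEW".

```python
def set_status(docs, cases, doc_id, new_status, timestamp):
    """Hypothetical stand-in for _set_status: change status and log it on the case."""
    doc = docs[doc_id]
    old_status = doc["status"]
    doc["status"] = new_status
    cases[doc["case_id"]]["history"].append(
        f"{timestamp}: {doc_id} {old_status}->{new_status}"
    )

def reject(docs, cases, doc_id, timestamp):
    doc = docs.get(doc_id)
    if doc is None:
        return None
    if doc["status"] != "REVIEW":
        return False
    set_status(docs, cases, doc_id, "REJECTED", timestamp)  # reject the doc itself first
    for other_id, other in docs.items():
        # rollback: every APPROVED doc in the same case goes back to DRAFT
        if other["case_id"] == doc["case_id"] and other["status"] == "APPROVED":
            set_status(docs, cases, other_id, "DRAFT", timestamp)
    return True

cases = {"case1": {"history": []}, "case2": {"history": []}}
docs = {
    "docA": {"case_id": "case1", "status": "REVIEW"},
    "docB": {"case_id": "case1", "status": "APPROVED"},
    "docC": {"case_id": "case2", "status": "APPROVED"},
}
reject(docs, cases, "docA", 300)
print(cases["case1"]["history"])
# ['300: docA REVIEW->REJECTED', '300: docB APPROVED->DRAFT']
```

Routing every transition through one helper keeps the status change and the history entry together, so the rollback shows up in the case history without extra code in `reject`.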
def get_case_status(self, timestamp, case_id): # 數一個 case 入面各 status 幾多份
self._purge(timestamp)
if case_id not in self.cases:
return None
counts = {}
for doc in self.docs.values(): # entity 唔喺 cases 入面,要掃 B dict 聚合
if doc["case_id"] == case_id:
status = doc["status"]
counts[status] = counts.get(status, 0) + 1
return counts
def get_reviewer_workload(self, timestamp, reviewer_id): # reviewer 手上幾多份 REVIEW doc
self._purge(timestamp)
count = 0
for doc in self.docs.values():
if doc["reviewer_id"] == reviewer_id and doc["status"] == "REVIEW":
count += 1
return count
self.docs = {
"docA": {"case_id": "case1", "reviewer_id": "rachel", "status": "REVIEW"},
"docB": {"case_id": "case1", "reviewer_id": "rachel", "status": "APPROVED"},
"docC": {"case_id": "case2", "reviewer_id": "ken", "status": "REVIEW"},
}
Step 1: neither case_id nor reviewer_id is a key of the B dict
Step 2: so the L2 queries must scan self.docs.values() and aggregate
Step 3: to split by status, build a counts dict and increment it as you go
def _purge(self, timestamp):  # drop expired docs
    expired = [
        doc_id for doc_id, doc in self.docs.items()
        if doc["expiry"] is not None and doc["expiry"] <= timestamp
    ]
    for doc_id in expired:
        del self.docs[doc_id]  # in this problem expiry deletes the doc instead of changing its status
def add_document(self, timestamp, case_id, doc_id, title, ttl_ms=None):  # the L1 method gains ttl_ms at L3
    self._purge(timestamp)
    if case_id not in self.cases or doc_id in self.docs:
        return False
    expiry = (timestamp + ttl_ms) if ttl_ms is not None else None  # expiry is only written when the doc is added
    self.docs[doc_id] = {
        "case_id": case_id,
        "title": title,
        "reviewer_id": None,
        "status": "DRAFT",
        "expiry": expiry,
    }
    self.cases[case_id]["history"].append(f"{timestamp}: {doc_id} added")  # keep the L1 history entry
    return True
self.docs = {
"docA": {"case_id": "case1", "status": "DRAFT", "expiry": 5100},
"docB": {"case_id": "case1", "status": "REVIEW", "expiry": None},
}
Step 1: the TTL belongs to the doc, so it goes in the B dict
Step 2: every public method starts with _purge(timestamp)
Step 3: in this problem expiry deletes the doc, while the case history stays in the A dict
def backup(self, timestamp): # 影低 cases + docs
self._purge(timestamp)
cases_snapshot = {
cid: {"client_name": c["client_name"], "history": list(c["history"])}
for cid, c in self.cases.items()
}
docs_snapshot = {}
for did, doc in self.docs.items():
remaining = (doc["expiry"] - timestamp) if doc["expiry"] is not None else None
docs_snapshot[did] = {
"case_id": doc["case_id"],
"title": doc["title"],
"reviewer_id": doc["reviewer_id"],
"status": doc["status"],
"remaining_ttl": remaining,
}
self.backups.append({"timestamp": timestamp, "cases": cases_snapshot, "docs": docs_snapshot})
return True
def restore(self, timestamp, target_ts): # 還原 target_ts 嗰張相
snap = None
for backup in self.backups:
if backup["timestamp"] == target_ts:
snap = backup
break
if snap is None:
return False
self.cases = {
cid: {"client_name": c["client_name"], "history": list(c["history"])}
for cid, c in snap["cases"].items()
}
self.docs = {}
for did, doc in snap["docs"].items():
expiry = timestamp + doc["remaining_ttl"] if doc["remaining_ttl"] is not None else None
self.docs[did] = {
"case_id": doc["case_id"],
"title": doc["title"],
"reviewer_id": doc["reviewer_id"],
"status": doc["status"],
"expiry": expiry,
}
return True
def get_case_history(self, timestamp, case_id): # history 跟 case 走
self._purge(timestamp)
if case_id not in self.cases:
return None
return list(self.cases[case_id]["history"])
self.backups = [
{
"timestamp": 5000,
"cases": {"case1": {"client_name": "Acme", "history": ["100: docA added"]}},
"docs": {
"docA": {"case_id": "case1", "title": "MSA", "reviewer_id": "rachel", "status": "REVIEW", "remaining_ttl": 1200},
},
},
]
self.cases["case1"]["history"] = [
"100: docA added",
"200: docA DRAFT->REVIEW",
]
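Step 1: on backup, convert each doc's expiry into remaining_ttl
Step 2: on restore, recompute expiry from the timestamp of the restore call
Step 3: history travels with the case, because rollback and status events all have to aggregate under the same case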
async def batch_ops(self, timestamp, operations):  # batch of add / submit / approve / reject
    results = []
    for op in operations:  # ops run one at a time, in input order
        op_type = op["type"]
        doc_id = op.get("doc_id")
        lock_key = doc_id if doc_id else op.get("case_id", "global")  # fall back to case_id only when the op carries no doc_id
        async with self.locks[lock_key]:  # ops on the same doc / case queue behind one lock
            if op_type == "add":
                res = self.add_document(timestamp, op["case_id"], op["doc_id"], op["title"], ttl_ms=op.get("ttl_ms"))
            elif op_type == "submit":
                res = self.submit_for_review(timestamp, doc_id)
            elif op_type == "approve":
                res = self.approve(timestamp, doc_id)
            elif op_type == "reject":
                res = self.reject(timestamp, doc_id)
            else:
                res = None
        results.append(res)  # results[i] corresponds to operations[i]
    return results
self.docs = {
"docA": {"status": "REVIEW"},
"docB": {"status": "APPROVED"},
}
self.locks = {
"docA": <asyncio.Lock>,
"docB": <asyncio.Lock>,
"case1": <asyncio.Lock>,
}
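Step 1: ops lock on doc_id; an add op carries a doc_id too, so it uses the same key
Step 2: an op with no doc_id falls back to case_id (or "global")
Step 3: inside the lock, call the existing sync methods directly
Step 4: the results list lines up index-for-index with the input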
async def notify_reviewers(self, timestamp, doc_ids, max_concurrent): # 通知 reviewer,先 fail-fast 再 sem
self._purge(timestamp)
for doc_id in doc_ids: # Phase 1:同步驗證
if doc_id not in self.docs:
raise ValueError(f"doc_id={doc_id} not found") # 唔存在即刻炸,唔做後面
if self.docs[doc_id]["status"] == "DRAFT":
raise ValueError(f"doc_id={doc_id} is DRAFT, cannot notify") # DRAFT 未交審,唔應該通知
sem = asyncio.Semaphore(max_concurrent) # Phase 2:真正 notify 先限流
notified = []
async def _notify_one(doc_id):
async with sem:
await asyncio.sleep(0.01) # 模擬網絡延遲
notified.append(doc_id)
tasks = []
for doc_id in doc_ids:
if self.docs[doc_id]["status"] == "REVIEW": # 只 notify REVIEW 狀態
tasks.append(_notify_one(doc_id))
await asyncio.gather(*tasks)
return notified
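Persistent state:
self.docs = {
"docA": {"status": "REVIEW"},
"docB": {"status": "APPROVED"},
}
self.locks = defaultdict(asyncio.Lock)
Per-call runtime state:
sem = asyncio.Semaphore(max_concurrent)
Step 1: a missing doc or a doc still in DRAFT fails fast (raises before any notify starts)
Step 2: only after validation passes do we acquire sem and sleep
Step 3: only docs in REVIEW are notified; APPROVED docs are not sent again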
Practice mock. Family 3: plans and subs are two flat dicts. The focus is the status machine (ACTIVE / PAUSED / CANCELLED), plus the L4 migrated_plans chain and per-customer payment history.
import asyncio
from collections import defaultdict
class SubscriptionBilling:
def __init__(self):
self.plans = {} # plan_id -> {"price_per_month": N}
self.subs = {} # sub_id -> {"plan_id", "customer_id", "status", "expiry"}
self.backups = [] # L4 backup
self.history = defaultdict(list) # customer_id -> payment records
self.migrated_plans = {} # old_plan -> new_plan
self.locks = defaultdict(asyncio.Lock)
self.plans = {
"basic": {"price_per_month": 10},
"pro": {"price_per_month": 25},
}
self.subs = {
"sub1": {"plan_id": "basic", "customer_id": "custA", "status": "ACTIVE", "expiry": 5100},
}
self.history = {"custA": [{"subscription_id": "sub1", "amount": 10, "timestamp": 5000}]}
self.migrated_plans = {"legacy": "basic"}
def create_plan(self, timestamp, plan_id, price_per_month): # 開 plan
self.plans[plan_id] = {"price_per_month": price_per_month}
return True
def subscribe(self, timestamp, plan_id, subscription_id, customer_id, ttl_ms=None): # 建 subscription
resolved = self._resolve_plan(plan_id) # plan 可能已 migrate,要先追鏈
if resolved not in self.plans or subscription_id in self.subs:
return False
expiry = timestamp + ttl_ms if ttl_ms is not None else None
self.subs[subscription_id] = {
"plan_id": resolved,
"customer_id": customer_id,
"status": "ACTIVE",
"expiry": expiry,
}
return True
def cancel(self, timestamp, subscription_id): # ACTIVE / PAUSED -> CANCELLED
self._purge_expired(timestamp)
if subscription_id not in self.subs:
return False
sub = self.subs[subscription_id]
if sub["status"] not in ("ACTIVE", "PAUSED"):
return False
sub["status"] = "CANCELLED"
return True
def pause(self, timestamp, subscription_id): # ACTIVE -> PAUSED
self._purge_expired(timestamp)
if subscription_id not in self.subs or self.subs[subscription_id]["status"] != "ACTIVE":
return False
self.subs[subscription_id]["status"] = "PAUSED"
return True
def resume(self, timestamp, subscription_id): # PAUSED -> ACTIVE
self._purge_expired(timestamp)
if subscription_id not in self.subs or self.subs[subscription_id]["status"] != "PAUSED":
return False
self.subs[subscription_id]["status"] = "ACTIVE"
return True
self.plans = {"basic": {"price_per_month": 10}}
self.subs = {
"sub1": {"plan_id": "basic", "customer_id": "custA", "status": "ACTIVE", "expiry": None},
"sub2": {"plan_id": "basic", "customer_id": "custB", "status": "PAUSED", "expiry": None},
}
Step 1: the A dict holds plans; the B dict holds subscriptions
Step 2: the status machine revolves around sub["status"]
Step 3: before subscribing, resolve the plan through the migrated-plan chain
def get_plan_subscribers(self, timestamp, plan_id): # 數某 plan 有幾多 ACTIVE sub
self._purge_expired(timestamp)
resolved = self._resolve_plan(plan_id)
count = 0
for sub in self.subs.values():
if sub["plan_id"] == resolved and sub["status"] == "ACTIVE":
count += 1
return count
def get_customer_subscriptions(self, timestamp, customer_id): # 搵 customer 全部 sub_id
self._purge_expired(timestamp)
result = []
for sub_id, sub in self.subs.items():
if sub["customer_id"] == customer_id:
result.append(sub_id)
return result
self.subs = {
"sub1": {"plan_id": "basic", "customer_id": "custA", "status": "ACTIVE"},
"sub2": {"plan_id": "basic", "customer_id": "custA", "status": "PAUSED"},
"sub3": {"plan_id": "pro", "customer_id": "custB", "status": "ACTIVE"},
}
Step 1: neither customer_id nor plan_id is a key of the B dict
Step 2: so scan self.subs.items() and aggregate
Step 3: when counting a plan's subscribers, count ACTIVE only
def _purge_expired(self, timestamp):  # a TTL that runs out auto-cancels the subscription
    expired = [
        sid for sid, sub in self.subs.items()
        if sub["expiry"] is not None and timestamp >= sub["expiry"] and sub["status"] != "CANCELLED"
    ]
    for sid in expired:
        self.subs[sid]["status"] = "CANCELLED"  # expiry changes the status here; the record is not deleted
def _resolve_plan(self, plan_id):  # follow the migrated_plans chain to the current plan
    visited = set()
    while plan_id in self.migrated_plans:
        if plan_id in visited:
            break  # cycle guard
        visited.add(plan_id)
        plan_id = self.migrated_plans[plan_id]
    return plan_id
self.subs = {
"sub1": {"plan_id": "basic", "status": "ACTIVE", "expiry": 5100},
"sub2": {"plan_id": "basic", "status": "CANCELLED", "expiry": 5000},
}
self.migrated_plans = {
"legacy-basic": "basic",
"starter": "legacy-basic",
}
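Step 1: the TTL belongs to the subscription, so it goes in the B dict
Step 2: after expiry the record is kept, but its status becomes CANCELLED
Step 3: any plan_id that comes in must first be resolved through the migration chain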
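Chain resolution in Step 3 is a loop over an old -> new mapping with a cycle guard. A minimal standalone sketch, where `resolve_plan` is a hypothetical free-function version of `_resolve_plan` that takes the mapping as an argument:

```python
def resolve_plan(migrated, plan_id):
    """Follow old -> new links until a plan with no further migration is reached."""
    seen = set()
    while plan_id in migrated and plan_id not in seen:
        seen.add(plan_id)  # remember visited ids so a cycle cannot loop forever
        plan_id = migrated[plan_id]
    return plan_id

migrated = {"legacy-basic": "basic", "starter": "legacy-basic"}
print(resolve_plan(migrated, "starter"))  # basic
print(resolve_plan(migrated, "pro"))      # pro (never migrated, returned unchanged)

cyclic = {"a": "b", "b": "a"}
print(resolve_plan(cyclic, "a"))          # a (the loop stops once it revisits "a")
```

Without the `seen` set, a mapping such as `{"a": "b", "b": "a"}` would spin forever; with it, the function returns as soon as it reaches an id it has already followed.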
def backup(self, timestamp): # 影低 plans / subs / history / migrated_plans
self._purge_expired(timestamp)
subs_snapshot = {}
for sid, sub in self.subs.items():
remaining = sub["expiry"] - timestamp if sub["expiry"] is not None else None
subs_snapshot[sid] = {
"plan_id": sub["plan_id"],
"customer_id": sub["customer_id"],
"status": sub["status"],
"remaining_ttl": remaining,
}
snapshot = {
"timestamp": timestamp,
"plans": {pid: dict(plan) for pid, plan in self.plans.items()},
"subs": subs_snapshot,
"history": {cid: list(records) for cid, records in self.history.items()},
"migrated_plans": dict(self.migrated_plans),
}
self.backups.append(snapshot)
return True
def restore(self, timestamp, target_ts): # 找指定 backup timestamp 還原
snapshot = None
for backup in self.backups:
if backup["timestamp"] == target_ts:
snapshot = backup
break
if snapshot is None:
return False
self.plans = {pid: dict(plan) for pid, plan in snapshot["plans"].items()}
self.subs = {}
for sid, sub in snapshot["subs"].items():
expiry = timestamp + sub["remaining_ttl"] if sub["remaining_ttl"] is not None else None
self.subs[sid] = {
"plan_id": sub["plan_id"],
"customer_id": sub["customer_id"],
"status": sub["status"],
"expiry": expiry,
}
self.history = defaultdict(list, {cid: list(records) for cid, records in snapshot["history"].items()})
self.migrated_plans = dict(snapshot["migrated_plans"])
return True
def get_payment_history(self, customer_id):  # payment history is tracked per customer, outside self.subs
    return list(self.history.get(customer_id, []))  # .get avoids creating an empty defaultdict entry on read
def migrate_plan(self, timestamp, from_plan, to_plan): # 將舊 plan 用家搬去新 plan
self._purge_expired(timestamp)
if to_plan not in self.plans:
return False
self.migrated_plans[from_plan] = to_plan
for sub in self.subs.values():
if sub["plan_id"] == from_plan:
sub["plan_id"] = to_plan
return True
self.backups = [
{
"timestamp": 5000,
"plans": {"basic": {"price_per_month": 10}},
"subs": {"sub1": {"plan_id": "basic", "customer_id": "custA", "status": "ACTIVE", "remaining_ttl": 1200}},
"history": {"custA": [{"subscription_id": "sub1", "amount": 10, "timestamp": 5000}]},
"migrated_plans": {"legacy-basic": "basic"},
},
]
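Step 1: on backup, convert expiry into remaining_ttl
Step 2: on restore, recompute expiry from the timestamp of the restore call
Step 3: payment history is not keyed by subscription, so it is tracked separately in self.history[customer_id]
Step 4: migrate_plan updates both the mapping and the plan_id of existing subs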
async def batch_ops(self, timestamp, operations): # subscribe / cancel / pause / resume 批量處理
results = [None] * len(operations)
async def run_op(idx, op):
op_type = op["type"]
sub_id = op.get("subscription_id")
async with self.locks[sub_id]: # 單 sub op 用 sub_id 上鎖
if op_type == "subscribe":
results[idx] = self.subscribe(timestamp, op["plan_id"], sub_id, op["customer_id"], ttl_ms=op.get("ttl_ms"))
elif op_type == "cancel":
results[idx] = self.cancel(timestamp, sub_id)
elif op_type == "pause":
results[idx] = self.pause(timestamp, sub_id)
elif op_type == "resume":
results[idx] = self.resume(timestamp, sub_id)
else:
results[idx] = False
await asyncio.gather(*[run_op(i, op) for i, op in enumerate(operations)])
return results
async def transfer_subscription(self, timestamp, subscription_id, new_plan_id): # 將一個 sub 轉去另一個 plan
resolved = self._resolve_plan(new_plan_id)
if resolved not in self.plans:
return False
async with self.locks[subscription_id]:
self._purge_expired(timestamp)
if subscription_id not in self.subs:
return False
sub = self.subs[subscription_id]
if sub["status"] == "CANCELLED":
return False
sub["plan_id"] = resolved
return True
self.subs = {
"sub1": {"plan_id": "basic", "status": "ACTIVE"},
"sub2": {"plan_id": "pro", "status": "PAUSED"},
}
self.locks = {
"sub1": <asyncio.Lock>,
"sub2": <asyncio.Lock>,
}
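Step 1: most ops touch a single subscription, so lock on sub_id
Step 2: the batch runs under gather, but ops on the same sub queue behind that sub's lock
Step 3: transfer needs no pair-lock, because it only changes plan_id on one sub record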
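Steps 1 and 2 can be observed directly: ops on the same sub run in arrival order, an unrelated sub is not held up, and `asyncio.gather` still returns results in input order. This is a sketch under assumptions: `run_batch`, the `TRANSITIONS` table, and the per-op `delay` field are hypothetical and exist only to make the interleaving visible.

```python
import asyncio
from collections import defaultdict

# op type -> (statuses it may start from, status it ends in)
TRANSITIONS = {
    "pause": ({"ACTIVE"}, "PAUSED"),
    "resume": ({"PAUSED"}, "ACTIVE"),
    "cancel": ({"ACTIVE", "PAUSED"}, "CANCELLED"),
}

async def run_batch(status, ops):
    locks = defaultdict(asyncio.Lock)  # one lock per subscription_id
    finish_order = []                  # op types in the order they actually completed

    async def run(op):
        sub_id = op["subscription_id"]
        async with locks[sub_id]:  # ops on the same sub wait here, first come first served
            await asyncio.sleep(op.get("delay", 0))  # hypothetical per-op latency
            allowed, target = TRANSITIONS[op["type"]]
            ok = status.get(sub_id) in allowed
            if ok:
                status[sub_id] = target
            finish_order.append(op["type"])
            return ok

    results = await asyncio.gather(*(run(op) for op in ops))
    return list(results), finish_order

status = {"sub1": "ACTIVE", "sub2": "ACTIVE"}
ops = [
    {"type": "pause", "subscription_id": "sub1", "delay": 0.02},
    {"type": "resume", "subscription_id": "sub1"},
    {"type": "cancel", "subscription_id": "sub2"},
]
results, finish_order = asyncio.run(run_batch(status, ops))
print(results)       # [True, True, True]
print(finish_order)  # ['cancel', 'pause', 'resume']
```

The resume succeeds only because it waits for the slow pause on the same lock. Since `gather` already preserves argument order, the pre-sized `results[idx]` list used in `batch_ops` is a stylistic choice rather than a requirement.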
async def process_payments(self, timestamp, subscription_ids, max_concurrent): # fail-fast + sem 收款
self._purge_expired(timestamp)
sem = asyncio.Semaphore(max_concurrent)
results = [None] * len(subscription_ids)
async def process_one(idx, sub_id):
if sub_id not in self.subs: # 唔存在即刻 fail
results[idx] = (sub_id, "PAYMENT_FAILED")
return
sub = self.subs[sub_id]
if sub["status"] == "CANCELLED": # CANCELLED 唔排隊
results[idx] = (sub_id, "PAYMENT_FAILED")
return
if sub["status"] != "ACTIVE": # PAUSED 一樣唔收錢
results[idx] = (sub_id, "PAYMENT_FAILED")
return
async with sem:
await asyncio.sleep(0.01) # 模擬 network payment
plan = self.plans.get(sub["plan_id"], {})
amount = plan.get("price_per_month", 0)
self.history[sub["customer_id"]].append({ # 成功後記付款 history
"subscription_id": sub_id,
"plan_id": sub["plan_id"],
"amount": amount,
"timestamp": timestamp,
})
results[idx] = (sub_id, "PAYMENT_SUCCESS")
await asyncio.gather(*[process_one(i, sid) for i, sid in enumerate(subscription_ids)])
return results
Persistent state:
self.subs = {
"sub1": {"customer_id": "custA", "status": "ACTIVE", "plan_id": "basic"},
"sub2": {"customer_id": "custB", "status": "CANCELLED", "plan_id": "pro"},
}
self.history = defaultdict(list)
Per-call runtime state:
sem = asyncio.Semaphore(max_concurrent)
Step 1: missing / CANCELLED / PAUSED all fail fast with PAYMENT_FAILED
Step 2: only ACTIVE subs acquire sem and sleep
Step 3: after a successful charge, append a record to self.history[customer_id]
Practice mock. Family 3: fleets and vehicles are two flat dicts. Because decommission(vehicle_id) and get_vehicle(vehicle_id) both address a vehicle directly, vehicles must sit flat in the B dict. The L4 focus is merged_fleets; L6 is the standard fail-fast + semaphore inspect flow.
import asyncio
from collections import defaultdict
class FleetTracker:
def __init__(self):
self.fleets = {} # fleet_id -> {"region": str, "history": [vehicle_id, ...]}
self.vehicles = {} # vehicle_id -> {"fleet_id", "mileage", "status", "expiry"}
self.backups = [] # L4 snapshot list
self.merged_fleets = {} # old_fleet -> new_fleet
self.locks = defaultdict(asyncio.Lock) # L5 per-vehicle lock
self.fleets = {
"fleet_west": {"region": "west", "history": ["truck7", "van2"]},
}
self.vehicles = {
"truck7": {"fleet_id": "fleet_west", "mileage": 42000, "status": "ACTIVE", "expiry": 8100},
}
self.merged_fleets = {"fleet_old": "fleet_west"}
def register_fleet(self, timestamp, fleet_id, region): # 開一條新 fleet
if fleet_id in self.fleets:
return False
self.fleets[fleet_id] = {"region": region, "history": []} # A dict 存 fleet metadata
return True
def add_vehicle(self, timestamp, fleet_id, vehicle_id, mileage, ttl_ms=None): # 將車加落 fleet
if fleet_id not in self.fleets or vehicle_id in self.vehicles:
return False
expiry = timestamp + ttl_ms if ttl_ms is not None else None
self.vehicles[vehicle_id] = { # B dict flat 存車
"fleet_id": fleet_id,
"mileage": mileage,
"status": "ACTIVE",
"expiry": expiry,
}
self.fleets[fleet_id]["history"].append(vehicle_id) # 記低呢架車曾經屬於過呢條 fleet
return True
def decommission(self, timestamp, vehicle_id): # 只帶 vehicle_id 直接停車
self._purge_expired(timestamp)
vehicle = self.vehicles.get(vehicle_id)
if vehicle is None or vehicle["status"] == "DECOMMISSIONED":
return False
vehicle["status"] = "DECOMMISSIONED"
return True
self.fleets = {
"fleet_west": {"region": "west", "history": ["truck7"]},
}
self.vehicles = {
"truck7": {"fleet_id": "fleet_west", "mileage": 42000, "status": "ACTIVE", "expiry": None},
}
Step 1: register_fleet builds the A dict (fleets).
Step 2: add_vehicle builds the flat B dict (vehicles).
Step 3: decommission takes only vehicle_id, so vehicles must be flat.
def get_fleet_size(self, timestamp, fleet_id): # 數某條 fleet 幾多架 ACTIVE 車
self._purge_expired(timestamp)
if fleet_id not in self.fleets:
return None
count = 0
for vehicle in self.vehicles.values():
if vehicle["fleet_id"] == fleet_id and vehicle["status"] == "ACTIVE":
count += 1
return count
def get_vehicle(self, timestamp, vehicle_id): # 攞一架車資料
vehicle = self.vehicles.get(vehicle_id)
if vehicle is None:
return None
if vehicle["expiry"] is not None and vehicle["status"] == "ACTIVE" and timestamp >= vehicle["expiry"]:
vehicle["status"] = "DECOMMISSIONED" # 單點查詢都會順手補做 expiry
return dict(vehicle)
def get_highest_mileage(self, timestamp, n): # top n mileage,active only
self._purge_expired(timestamp)
active = [
(vehicle_id, vehicle["mileage"])
for vehicle_id, vehicle in self.vehicles.items()
if vehicle["status"] == "ACTIVE"
]
active.sort(key=lambda x: (-x[1], x[0])) # mileage desc,tie-break vehicle_id asc
return active[:n]
self.vehicles = {
"truck7": {"fleet_id": "fleet_west", "mileage": 42000, "status": "ACTIVE"},
"van2": {"fleet_id": "fleet_west", "mileage": 18000, "status": "ACTIVE"},
"car1": {"fleet_id": "fleet_east", "mileage": 51000, "status": "ACTIVE"},
}
Step 1: fleet_id is not a key of the B dict, so counting a fleet means scanning vehicles and aggregating.
Step 2: For top N, filter ACTIVE first, then sort by (-mileage, vehicle_id).
Step 3: get_vehicle is a direct lookup, the typical Family 3 flat-B question.
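A minimal sketch of the Step 2 sort on made-up data: filtering ACTIVE first and sorting on the tuple (-mileage, vehicle_id) gives mileage descending with ties broken by vehicle_id ascending. The function name top_mileage and the sample rows are illustrative only, not part of the mock.

```python
def top_mileage(vehicles, n):
    """Return the top-n ACTIVE vehicles as (vehicle_id, mileage) pairs."""
    active = [
        (vehicle_id, v["mileage"])
        for vehicle_id, v in vehicles.items()
        if v["status"] == "ACTIVE"
    ]
    # Negating the mileage sorts it descending while vehicle_id stays ascending.
    active.sort(key=lambda row: (-row[1], row[0]))
    return active[:n]


sample = {
    "van2": {"mileage": 18000, "status": "ACTIVE"},
    "truck7": {"mileage": 42000, "status": "ACTIVE"},
    "car1": {"mileage": 42000, "status": "ACTIVE"},
    "bus9": {"mileage": 99000, "status": "DECOMMISSIONED"},
}
top_two = top_mileage(sample, 2)
```

Here car1 and truck7 tie on mileage, so the ID decides the order; bus9 is excluded even though its mileage is highest.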
def _purge_expired(self, timestamp): # 掃全部車,過期就標 DECOMMISSIONED
for vehicle in self.vehicles.values():
if vehicle["expiry"] is not None and vehicle["status"] == "ACTIVE" and timestamp >= vehicle["expiry"]:
vehicle["status"] = "DECOMMISSIONED"
def _is_active(self, vehicle_id, timestamp): # 查一架車仲 active 唔 active
vehicle = self.vehicles.get(vehicle_id)
if vehicle is None:
return False
if vehicle["expiry"] is not None and vehicle["status"] == "ACTIVE" and timestamp >= vehicle["expiry"]:
vehicle["status"] = "DECOMMISSIONED" # 單點 check 都會補做 expiry
return vehicle["status"] == "ACTIVE"
self.vehicles = {
"truck7": {"fleet_id": "fleet_west", "status": "ACTIVE", "expiry": 8100},
"van2": {"fleet_id": "fleet_west", "status": "DECOMMISSIONED", "expiry": 5000},
}
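Step 1: TTL belongs to the vehicle, so expiry lives in the B dict.
Step 2: Here an expired vehicle is not deleted; its status becomes DECOMMISSIONED.
Step 3: Both the bulk purge and the single-vehicle _is_active check must apply expiry lazily.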
def backup(self, timestamp): # 影成個 fleet system snapshot
self._purge_expired(timestamp)
fleets_copy = {
fleet_id: {"region": fleet["region"], "history": list(fleet["history"])}
for fleet_id, fleet in self.fleets.items()
}
vehicles_copy = {}
for vehicle_id, vehicle in self.vehicles.items():
remaining = vehicle["expiry"] - timestamp if vehicle["expiry"] is not None and vehicle["status"] == "ACTIVE" else None
vehicles_copy[vehicle_id] = {
"fleet_id": vehicle["fleet_id"],
"mileage": vehicle["mileage"],
"status": vehicle["status"],
"remaining_ttl": remaining,
}
self.backups.append({
"timestamp": timestamp,
"fleets": fleets_copy,
"vehicles": vehicles_copy,
"merged_fleets": dict(self.merged_fleets),
})
return len(self.backups)
def restore(self, timestamp, target_ts): # 還原指定 backup timestamp
snapshot = None
for backup in self.backups:
if backup["timestamp"] == target_ts:
snapshot = backup
break
if snapshot is None:
return False
self.fleets = {fid: {"region": fleet["region"], "history": list(fleet["history"])} for fid, fleet in snapshot["fleets"].items()}
self.vehicles = {}
for vehicle_id, vehicle in snapshot["vehicles"].items():
expiry = timestamp + vehicle["remaining_ttl"] if vehicle["remaining_ttl"] is not None else None
self.vehicles[vehicle_id] = {
"fleet_id": vehicle["fleet_id"],
"mileage": vehicle["mileage"],
"status": vehicle["status"],
"expiry": expiry,
}
self.merged_fleets = dict(snapshot["merged_fleets"])
return True
def get_vehicle_history(self, timestamp, fleet_id): # 歷史跟 fleet 走,但要先 resolve merge chain
resolved = self._resolve_fleet(fleet_id)
if resolved not in self.fleets:
return None
return list(self.fleets[resolved]["history"])
def merge_fleet(self, timestamp, from_fleet_id, to_fleet_id): # 將 ACTIVE 車搬去另一條 fleet
self._purge_expired(timestamp)
if from_fleet_id not in self.fleets or to_fleet_id not in self.fleets:
return False
for vehicle_id, vehicle in self.vehicles.items():
if vehicle["fleet_id"] == from_fleet_id and vehicle["status"] == "ACTIVE":
vehicle["fleet_id"] = to_fleet_id
self.fleets[to_fleet_id]["history"].append(vehicle_id)
self.merged_fleets[from_fleet_id] = to_fleet_id
return True
self.backups = [
{
"timestamp": 5000,
"fleets": {"fleet_west": {"region": "west", "history": ["truck7", "van2"]}},
"vehicles": {"truck7": {"fleet_id": "fleet_west", "mileage": 42000, "status": "ACTIVE", "remaining_ttl": 1200}},
"merged_fleets": {"fleet_old": "fleet_west"},
},
]
self.merged_fleets = {"fleet_old": "fleet_west"}
Step 1: On backup, convert each vehicle's expiry to remaining_ttl.
Step 2: History belongs to the fleet, so it lives in the A dict.
Step 3: merge mainly rewrites vehicle["fleet_id"], then records the merged_fleets chain.
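get_vehicle_history calls self._resolve_fleet, which is not shown on this page. A minimal sketch of what such a helper might look like, assuming merged_fleets maps old_fleet -> new_fleet and a chain can be several hops long. The standalone function resolve_fleet and its cycle guard are assumptions, not the mock's actual code.

```python
def resolve_fleet(merged_fleets, fleet_id):
    """Follow old_fleet -> new_fleet links until reaching a fleet with no further link."""
    seen = set()  # assumed guard so a cyclic merge record cannot loop forever
    while fleet_id in merged_fleets and fleet_id not in seen:
        seen.add(fleet_id)
        fleet_id = merged_fleets[fleet_id]
    return fleet_id


merged = {"fleet_old": "fleet_mid", "fleet_mid": "fleet_west"}
final_fleet = resolve_fleet(merged, "fleet_old")
untouched = resolve_fleet(merged, "fleet_east")
```

A fleet that was never merged resolves to itself, so callers can resolve every incoming fleet_id unconditionally.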
async def batch_ops(self, timestamp, operations): # add / decommission 批量處理
results = []
for op in operations:
if op["type"] == "add":
vehicle_id = op["vehicle_id"]
async with self.locks[vehicle_id]: # 單車 op 鎖 vehicle_id
result = self.add_vehicle(timestamp, op["fleet_id"], vehicle_id, op["mileage"], ttl_ms=op.get("ttl_ms"))
results.append(result)
elif op["type"] == "decommission":
vehicle_id = op["vehicle_id"]
async with self.locks[vehicle_id]:
results.append(self.decommission(timestamp, vehicle_id))
else:
results.append(False)
return results
async def transfer_vehicle(self, timestamp, vehicle_id, new_fleet_id): # 將一架 ACTIVE 車轉去另一條 fleet
self._purge_expired(timestamp)
async with self.locks[vehicle_id]:
vehicle = self.vehicles.get(vehicle_id)
if vehicle is None or vehicle["status"] != "ACTIVE":
return False
if new_fleet_id not in self.fleets or vehicle["fleet_id"] == new_fleet_id:
return False
vehicle["fleet_id"] = new_fleet_id
self.fleets[new_fleet_id]["history"].append(vehicle_id)
return True
self.vehicles = {
"truck7": {"fleet_id": "fleet_west", "status": "ACTIVE"},
"van2": {"fleet_id": "fleet_east", "status": "ACTIVE"},
}
self.locks = {
"truck7": <asyncio.Lock>,
"van2": <asyncio.Lock>,
}
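Step 1: Single-vehicle operations all revolve around vehicle_id, so Pattern A (one lock per key) is enough.
Step 2: Inside the batch, lock the relevant vehicle for each op, then call the existing sync method.
Step 3: transfer also changes only one vehicle record, so no pair-lock is needed.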
async def inspect_vehicles(self, timestamp, vehicle_ids, max_concurrent): # fail-fast + semaphore 檢車
self._purge_expired(timestamp)
for vehicle_id in vehicle_ids: # Phase 1:先驗證
vehicle = self.vehicles.get(vehicle_id)
if vehicle is None:
return {"error": f"Vehicle {vehicle_id} not found"} # 唔存在即刻 fail
if vehicle["status"] == "DECOMMISSIONED":
return {"error": f"Vehicle {vehicle_id} is DECOMMISSIONED"} # 已退役一樣唔做
sem = asyncio.Semaphore(max_concurrent) # Phase 2:過關先限流
results = {}
async def _inspect_one(vehicle_id):
async with sem:
await asyncio.sleep(0) # 模擬 inspection 工作
results[vehicle_id] = "INSPECTED"
tasks = [asyncio.create_task(_inspect_one(vehicle_id)) for vehicle_id in vehicle_ids]
await asyncio.gather(*tasks)
return results
Persistent state:
self.vehicles = {
"truck7": {"status": "ACTIVE"},
"van2": {"status": "DECOMMISSIONED"},
}
self.locks = defaultdict(asyncio.Lock)
Transient runtime:
sem = asyncio.Semaphore(max_concurrent)
Step 1: Missing or DECOMMISSIONED vehicles fail fast first.
Step 2: Only when every vehicle is valid, create the semaphore and sleep.
Step 3: On success, return INSPECTED for each vehicle.
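The two phases above can be sketched in isolation. This is an illustrative, self-contained version, not the mock itself: inspect_all, known_ids, and the running/peak counters are hypothetical names added only to make the concurrency limit observable.

```python
import asyncio


async def inspect_all(known_ids, requested_ids, max_concurrent):
    # Phase 1: validate everything before doing any work (fail-fast).
    for vehicle_id in requested_ids:
        if vehicle_id not in known_ids:
            return {"error": f"Vehicle {vehicle_id} not found"}

    # Phase 2: only now create the semaphore and start the rate-limited work.
    sem = asyncio.Semaphore(max_concurrent)
    running = 0
    peak = 0
    results = {}

    async def inspect_one(vehicle_id):
        nonlocal running, peak
        async with sem:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for the real inspection
            running -= 1
            results[vehicle_id] = "INSPECTED"

    await asyncio.gather(*(inspect_one(v) for v in requested_ids))
    return {"results": results, "peak": peak}


ok = asyncio.run(inspect_all({"a", "b", "c", "d"}, ["a", "b", "c", "d"], 2))
bad = asyncio.run(inspect_all({"a"}, ["a", "zz"], 2))
```

In the failing call no coroutine ever sleeps, which is the property the fail-fast variant is meant to guarantee.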
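Practice mock. Family 3: nodes is the A dict, objects is the B dict. The key points: in L4, when capacity is insufficient, evict LRU objects one at a time in a while loop; L5 has a single-lock batch and a pair-lock replicate; L6 purges nodes in bulk under a thread semaphore.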
import threading
from collections import defaultdict
class CDNEdgeCache:
def __init__(self):
self.nodes = {} # node_id -> {"storage_limit_mb": N, "history": [...]}
self.objects = {} # object_id -> {"node_id", "size_mb", "access_time", "expiry"}
self._locks = defaultdict(threading.Lock) # L5 per-node lock
self._meta_lock = threading.Lock() # 保護 _locks dict 建立過程
def __init__(self):
self.nodes = {}
self.objects = {}
self._locks = defaultdict(threading.Lock)
self._meta_lock = threading.Lock()
self.nodes = {
"edge-a": {"storage_limit_mb": 500, "history": [{"action": "register", "timestamp": 100}]},
}
self.objects = {
"logo.png": {"node_id": "edge-a", "size_mb": 12, "access_time": 1000, "expiry": 5100},
}
self._locks = {"edge-a": <threading.Lock>}
def register_node(self, timestamp, node_id, storage_limit_mb): # 開一個新 edge node
self._purge(timestamp)
if node_id in self.nodes:
return {"status": "error", "message": f"Node {node_id} already exists"}
self.nodes[node_id] = {
"storage_limit_mb": storage_limit_mb,
"history": [{"action": "register", "timestamp": timestamp, "storage_limit_mb": storage_limit_mb}],
}
return {"status": "ok", "node_id": node_id}
def cache_object(self, timestamp, node_id, object_id, size_mb, ttl_ms=None): # 將 object 放落 node
self._purge(timestamp)
if node_id not in self.nodes:
return {"status": "error", "message": f"Node {node_id} not found"}
expiry = (timestamp + ttl_ms) if ttl_ms is not None else None
if object_id in self.objects:
del self.objects[object_id] # 同 id 重 cache 就先覆蓋舊 object
self._evict_lru_from(timestamp, node_id, size_mb) # L4:容量唔夠就 while 踢 LRU
usage = self._calc_usage(node_id)
limit = self.nodes[node_id]["storage_limit_mb"]
if usage + size_mb > limit:
return {"status": "error", "message": f"Not enough space on {node_id} even after eviction"}
self.objects[object_id] = {"node_id": node_id, "size_mb": size_mb, "access_time": timestamp, "expiry": expiry}
self.nodes[node_id]["history"].append({"action": "cache", "timestamp": timestamp, "object_id": object_id, "size_mb": size_mb, "ttl_ms": ttl_ms})
return {"status": "ok", "object_id": object_id, "node_id": node_id}
def invalidate(self, timestamp, object_id): # 手動令 object 失效
self._purge(timestamp)
if object_id not in self.objects:
return {"status": "error", "message": f"Object {object_id} not found"}
node_id = self.objects[object_id]["node_id"]
del self.objects[object_id]
return {"status": "ok", "object_id": object_id, "node_id": node_id}
def get_node_usage(self, timestamp, node_id): # 查 node 用量
self._purge(timestamp)
if node_id not in self.nodes:
return {"status": "error", "message": f"Node {node_id} not found"}
usage = self._calc_usage(node_id)
limit = self.nodes[node_id]["storage_limit_mb"]
return {"status": "ok", "node_id": node_id, "used_mb": usage, "limit_mb": limit, "free_mb": limit - usage}
self.nodes = {
"edge-a": {"storage_limit_mb": 500, "history": []},
}
self.objects = {
"logo.png": {"node_id": "edge-a", "size_mb": 12, "access_time": 1000, "expiry": None},
}
Step 1: register_node builds the A dict.
Step 2: cache_object builds the flat B dict.
Step 3: invalidate / get_object_location take only object_id, so B must be flat.
def get_object_location(self, timestamp, object_id): # 查 object 而家喺邊個 node
self._purge(timestamp)
if object_id not in self.objects:
return {"status": "error", "message": f"Object {object_id} not found"}
obj = self.objects[object_id]
obj["access_time"] = timestamp # 讀取都算 access,會影響 LRU
return {"status": "ok", "object_id": object_id, "node_id": obj["node_id"], "size_mb": obj["size_mb"], "expiry": obj["expiry"]}
def get_largest_nodes(self, timestamp, n): # top n 用量最大 node
self._purge(timestamp)
rows = []
for node_id in self.nodes:
rows.append({
"node_id": node_id,
"used_mb": self._calc_usage(node_id), # usage 要掃 B dict 加總
"limit_mb": self.nodes[node_id]["storage_limit_mb"],
})
rows.sort(key=lambda x: x["used_mb"], reverse=True)
return {"status": "ok", "nodes": rows[:n]}
self.objects = {
"logo.png": {"node_id": "edge-a", "size_mb": 12, "access_time": 1000},
"video.mp4": {"node_id": "edge-a", "size_mb": 90, "access_time": 950},
"hero.jpg": {"node_id": "edge-b", "size_mb": 40, "access_time": 990},
}
Step 1: Object lookup is direct by object_id.
Step 2: Node ranking scans objects to aggregate usage per node.
Step 3: get_object_location also updates access_time, which feeds the LRU logic.
def _purge(self, timestamp): # 掃走所有過期 object
expired_ids = [
object_id for object_id, obj in self.objects.items()
if obj["expiry"] is not None and obj["expiry"] <= timestamp
]
for object_id in expired_ids:
node_id = self.objects[object_id]["node_id"]
if node_id in self.nodes:
self.nodes[node_id]["history"].append({
"action": "auto_purge",
"timestamp": timestamp,
"object_id": object_id,
"size_mb": self.objects[object_id]["size_mb"],
})
del self.objects[object_id] # 呢題過期係直接 delete object
# TTL 寫喺 L1 cache_object 入面:
# expiry = timestamp + ttl_ms if ttl_ms is not None else None
self.objects = {
"logo.png": {"node_id": "edge-a", "size_mb": 12, "access_time": 1000, "expiry": 5100},
"hero.jpg": {"node_id": "edge-b", "size_mb": 40, "access_time": 990, "expiry": None},
}
Step 1: TTL belongs to the object, so expiry lives in the B dict.
Step 2: An expired object is deleted outright.
Step 3: Before deleting, append an auto_purge entry to the node's history.
def _evict_lru_from(self, timestamp, node_id, needed_mb):  # while loop: evict the oldest object one at a time
    while True:
        usage = self._calc_usage(node_id)
        limit = self.nodes[node_id]["storage_limit_mb"]
        if usage + needed_mb <= limit:
            break  # enough space freed, stop
        node_objects = [
            (object_id, obj) for object_id, obj in self.objects.items()
            if obj["node_id"] == node_id
        ]
        if not node_objects:
            break  # nothing left to evict on this node
        lru_id, lru_obj = min(node_objects, key=lambda x: x[1]["access_time"])  # smallest access_time = least recently used
        self.nodes[node_id]["history"].append({"action": "evict_lru", "timestamp": timestamp, "object_id": lru_id, "size_mb": lru_obj["size_mb"]})
        del self.objects[lru_id]
def backup(self, timestamp, node_id): # 備份單一 node 所有 object
self._purge(timestamp)
if node_id not in self.nodes:
return {"status": "error", "message": f"Node {node_id} not found"}
snapshot = []
for object_id, obj in self.objects.items():
if obj["node_id"] != node_id:
continue
remaining = obj["expiry"] - timestamp if obj["expiry"] is not None else None
snapshot.append({"object_id": object_id, "size_mb": obj["size_mb"], "access_time": obj["access_time"], "remaining_ttl": remaining})
return {"status": "ok", "node_id": node_id, "snapshot": snapshot}
def restore(self, timestamp, node_id, snapshot): # 根據 remaining_ttl 還原 object
self._purge(timestamp)
if node_id not in self.nodes:
return {"status": "error", "message": f"Node {node_id} not found"}
restored = 0
skipped = 0
for item in snapshot:
expiry = timestamp + item["remaining_ttl"] if item.get("remaining_ttl") is not None else None
self._evict_lru_from(timestamp, node_id, item["size_mb"]) # restore 前都可能要先騰位
if self._calc_usage(node_id) + item["size_mb"] > self.nodes[node_id]["storage_limit_mb"]:
skipped += 1
continue
self.objects[item["object_id"]] = {"node_id": node_id, "size_mb": item["size_mb"], "access_time": item.get("access_time", timestamp), "expiry": expiry}
restored += 1
return {"status": "ok", "restored": restored, "skipped": skipped}
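def set_storage_limit(self, timestamp, node_id, new_limit_mb):  # shrinking capacity also runs the while-loop LRU eviction
    self._purge(timestamp)
    if node_id not in self.nodes:  # same error shape as the other node methods, instead of a KeyError
        return {"status": "error", "message": f"Node {node_id} not found"}
    old_limit = self.nodes[node_id]["storage_limit_mb"]
    self.nodes[node_id]["storage_limit_mb"] = new_limit_mb
    self._evict_lru_from(timestamp, node_id, 0)  # no new object is being added, so needed_mb = 0
    return {"status": "ok", "node_id": node_id, "old_limit_mb": old_limit, "new_limit_mb": new_limit_mb}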
self.nodes = {
"edge-a": {"storage_limit_mb": 500, "history": [{"action": "evict_lru", "object_id": "old.css"}]},
}
self.objects = {
"hero.jpg": {"node_id": "edge-a", "size_mb": 40, "access_time": 990, "expiry": 6200},
}
snapshot = [
{"object_id": "hero.jpg", "size_mb": 40, "access_time": 990, "remaining_ttl": 1200},
]
Step 1: The core of LRU is a while loop that evicts the single oldest object per iteration.
Step 2: On backup, convert expiry to remaining_ttl.
Step 3: restore and set_storage_limit may both reuse the same while loop to free space.
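cache_object, restore, and set_storage_limit all rely on self._calc_usage, which is not shown on this page. A sketch of that helper together with the eviction loop, written as standalone functions and assuming usage is simply the sum of size_mb over the node's objects. The names calc_usage and evict_lru are illustrative.

```python
def calc_usage(objects, node_id):
    """Sum size_mb over every object stored on node_id (a scan of the flat B dict)."""
    return sum(obj["size_mb"] for obj in objects.values() if obj["node_id"] == node_id)


def evict_lru(objects, node_id, limit_mb, needed_mb):
    """Evict least-recently-used objects until needed_mb fits; return evicted ids in order."""
    evicted = []
    while calc_usage(objects, node_id) + needed_mb > limit_mb:
        candidates = [oid for oid, obj in objects.items() if obj["node_id"] == node_id]
        if not candidates:
            break  # nothing left to evict; the caller must still re-check capacity
        lru_id = min(candidates, key=lambda oid: objects[oid]["access_time"])
        del objects[lru_id]
        evicted.append(lru_id)
    return evicted


objects = {
    "logo.png": {"node_id": "edge-a", "size_mb": 12, "access_time": 1000},
    "video.mp4": {"node_id": "edge-a", "size_mb": 90, "access_time": 950},
    "old.css": {"node_id": "edge-a", "size_mb": 5, "access_time": 900},
    "hero.jpg": {"node_id": "edge-b", "size_mb": 40, "access_time": 990},
}
evicted = evict_lru(objects, "edge-a", limit_mb=100, needed_mb=20)
```

Evicting old.css alone frees too little, so the loop runs a second time and removes video.mp4. Objects on edge-b are never candidates.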
def batch_ops(self, timestamp, operations): # cache / invalidate 批量做
self._purge(timestamp)
involved_nodes = set()
for op in operations: # 先收集所有會郁到嘅 node
if op["op"] == "cache":
involved_nodes.add(op["node_id"])
elif op["op"] == "invalidate" and op["object_id"] in self.objects:
involved_nodes.add(self.objects[op["object_id"]]["node_id"])
locks = [self._get_lock(node_id) for node_id in sorted(involved_nodes)] # sorted node lock,避免 deadlock
for lock in locks:
lock.acquire()
try:
results = []
for op in operations:
if op["op"] == "cache":
results.append(self.cache_object(timestamp, op["node_id"], op["object_id"], op["size_mb"], ttl_ms=op.get("ttl_ms")))
elif op["op"] == "invalidate":
results.append(self.invalidate(timestamp, op["object_id"]))
else:
results.append({"status": "error", "message": f"Unknown op: {op['op']}"})
return {"status": "ok", "results": results}
finally:
for lock in reversed(locks):
lock.release()
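def replicate(self, timestamp, object_id, from_node, to_node, ttl_ms=None):  # copy an object to another node
    self._purge(timestamp)
    if from_node not in self.nodes or to_node not in self.nodes:
        return {"status": "error", "message": "Node not found"}
    # pair-lock: acquire in sorted order; the set ensures the same node is never locked twice (threading.Lock is not reentrant)
    locks = [self._get_lock(node_id) for node_id in sorted({from_node, to_node})]
    for lock in locks:
        lock.acquire()
    try:
        obj = self.objects.get(object_id)
        if obj is None or obj["node_id"] != from_node:
            return {"status": "error", "message": f"Object {object_id} not found on {from_node}"}
        replica_id = f"{object_id}@{to_node}"
        expiry = timestamp + ttl_ms if ttl_ms is not None else obj["expiry"]
        self._evict_lru_from(timestamp, to_node, obj["size_mb"])
        if self._calc_usage(to_node) + obj["size_mb"] > self.nodes[to_node]["storage_limit_mb"]:
            return {"status": "error", "message": f"Not enough space on {to_node} even after eviction"}
        self.objects[replica_id] = {"node_id": to_node, "size_mb": obj["size_mb"], "access_time": timestamp, "expiry": expiry}
        return {"status": "ok", "replica_id": replica_id, "from_node": from_node, "to_node": to_node}
    finally:
        for lock in reversed(locks):
            lock.release()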
self._locks = {
"edge-a": <threading.Lock>,
"edge-b": <threading.Lock>,
}
self.objects = {
"logo.png": {"node_id": "edge-a", "size_mb": 12},
}
Step 1: A batch that touches a single node is Pattern A.
Step 2: replicate touches both the source and target nodes, so it needs a pair-lock.
Step 3: Always sort the node_ids before acquiring, to avoid deadlock.
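batch_ops and replicate call self._get_lock, which is not shown on this page. One plausible shape, assuming _meta_lock exists only to make lock creation atomic. The LockTable class and acquire_all method are hypothetical names used for illustration.

```python
import threading
from contextlib import ExitStack


class LockTable:
    """Hypothetical helper: one threading.Lock per node_id, created on demand."""

    def __init__(self):
        self._locks = {}
        self._meta_lock = threading.Lock()  # guards creation of entries in _locks

    def get_lock(self, node_id):
        with self._meta_lock:
            if node_id not in self._locks:
                self._locks[node_id] = threading.Lock()
            return self._locks[node_id]

    def acquire_all(self, node_ids):
        """Acquire locks in sorted order; the set drops duplicates so no lock is taken twice."""
        stack = ExitStack()
        try:
            for node_id in sorted(set(node_ids)):
                stack.enter_context(self.get_lock(node_id))
        except BaseException:
            stack.close()  # release whatever was already acquired
            raise
        return stack  # leaving the with-block releases in reverse order


table = LockTable()
with table.acquire_all(["edge-b", "edge-a", "edge-b"]):
    held = (table.get_lock("edge-a").locked(), table.get_lock("edge-b").locked())
released = (table.get_lock("edge-a").locked(), table.get_lock("edge-b").locked())
```

Deduplicating matters because threading.Lock is not reentrant: acquiring the same node's lock twice in one thread would block forever.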
def purge_nodes(self, timestamp, node_ids, max_concurrent):  # bulk-clear multiple nodes
    self._purge(timestamp)
    for node_id in node_ids:  # Phase 1: fail-fast
        if node_id not in self.nodes:
            return {"status": "error", "message": f"Node {node_id} not found — fail-fast, no nodes purged"}
    semaphore = threading.Semaphore(max_concurrent)  # Phase 2: thread semaphore
    results = {}
    result_lock = threading.Lock()
    def _purge_single_node(node_id):  # clear a single node
        with semaphore:
            # snapshot the items first: other threads delete from self.objects concurrently
            to_remove = [object_id for object_id, obj in list(self.objects.items()) if obj["node_id"] == node_id]
            removed_count = 0
            freed_mb = 0.0
            for object_id in to_remove:
                obj = self.objects.pop(object_id, None)  # pop tolerates an object already removed elsewhere
                if obj is not None:
                    freed_mb += obj["size_mb"]
                    removed_count += 1
            self.nodes[node_id]["history"].append({"action": "purge_node", "timestamp": timestamp, "removed_count": removed_count, "freed_mb": freed_mb})
            with result_lock:
                results[node_id] = {"removed_count": removed_count, "freed_mb": freed_mb}
    threads = []
    for node_id in node_ids:
        thread = threading.Thread(target=_purge_single_node, args=(node_id,))
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()
    return {"status": "ok", "purged": results}
Persistent state:
self.nodes = {
"edge-a": {"history": [{"action": "purge_node", "removed_count": 2, "freed_mb": 40.0}]},
}
self.objects = {
"hero.jpg": {"node_id": "edge-b", "size_mb": 40},
}
Transient runtime:
semaphore = threading.Semaphore(max_concurrent)
result_lock = threading.Lock()
Step 1: A missing node fails fast first; never purge halfway.
Step 2: Once validation passes, start one thread per node.
Step 3: Inside each thread, the semaphore limits purging to at most N nodes at a time.
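A small, self-contained sketch of Steps 2 and 3: every node gets its own thread, but the semaphore caps how many are inside the work section at once. run_limited and the running/peak counters are hypothetical additions used only to observe the limit.

```python
import threading
import time


def run_limited(node_ids, max_concurrent):
    """Start one thread per node; the semaphore caps how many run the body at once."""
    semaphore = threading.Semaphore(max_concurrent)
    state_lock = threading.Lock()  # protects the shared counters below
    state = {"running": 0, "peak": 0, "done": []}

    def worker(node_id):
        with semaphore:
            with state_lock:
                state["running"] += 1
                state["peak"] = max(state["peak"], state["running"])
            time.sleep(0.02)  # stand-in for the real purge work
            with state_lock:
                state["running"] -= 1
                state["done"].append(node_id)

    threads = [threading.Thread(target=worker, args=(node_id,)) for node_id in node_ids]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state


state = run_limited(["edge-a", "edge-b", "edge-c", "edge-d", "edge-e"], 2)
```

All five threads start immediately; the extra ones simply block on the semaphore until a slot frees up, so the recorded peak never exceeds max_concurrent.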
This page corresponds to your newly added practice-mocks/compliance_audit.py. Family 3: policies and violations are two flat dicts. The key points: entity history is tracked independently; the L4 merge changes entity ownership; L5 has a transfer pair-lock; L6 reports violations with fail-fast + semaphore.
import asyncio
import copy
from collections import defaultdict
class ComplianceAuditEngine:
def __init__(self):
self.policies = {} # policy_id -> {"description": str, "max_violations": int}
self.violations = {} # violation_id -> {"policy_id", "entity_id", "severity", "status", "expiry"}
self.history = defaultdict(list) # entity_id -> [violation_id, ...]
self.merged_entities = {} # old_entity -> new_entity
self.backups = [] # L4 backup list
self.locks = defaultdict(asyncio.Lock) # L5 per-key lock
def __init__(self):
self.policies = {}
self.violations = {}
self.history = defaultdict(list)
self.merged_entities = {}
self.backups = []
self.locks = defaultdict(asyncio.Lock)
self.policies = {
"p_late": {"description": "late filing", "max_violations": 3},
}
self.violations = {
"v1": {"policy_id": "p_late", "entity_id": "ent_a", "severity": 4, "status": "ACTIVE", "expiry": 8100},
}
self.history = {"ent_a": ["v1", "v2"]}
self.merged_entities = {"ent_old": "ent_a"}
def register_policy(self, timestamp, policy_id, description, max_violations): # 開一條 policy
if policy_id in self.policies:
return False
self.policies[policy_id] = {
"description": description,
"max_violations": max_violations,
}
return True
def flag_violation(self, timestamp, policy_id, entity_id, violation_id, severity, ttl_ms=0): # 新增 violation
self._purge(timestamp)
entity_id = self._resolve_entity(entity_id) # merge 過嘅 entity 要先追鏈
if policy_id not in self.policies or violation_id in self.violations:
return False
max_v = self.policies[policy_id]["max_violations"]
if self._count_active_for_entity_policy(entity_id, policy_id) >= max_v:
return False # 已經去到上限唔畀再記
expiry = (timestamp + ttl_ms) if ttl_ms > 0 else None
self.violations[violation_id] = {
"policy_id": policy_id,
"entity_id": entity_id,
"severity": severity,
"status": "ACTIVE",
"expiry": expiry,
}
self.history[entity_id].append(violation_id) # independent history 喺呢度記
return True
def clear_violation(self, timestamp, violation_id): # ACTIVE -> CLEARED
self._purge(timestamp)
if violation_id not in self.violations:
return False
violation = self.violations[violation_id]
if violation["status"] != "ACTIVE":
return False
violation["status"] = "CLEARED"
return True
def get_active_violations(self, timestamp, entity_id): # 數某 entity 跨 policy ACTIVE 幾多單
self._purge(timestamp)
entity_id = self._resolve_entity(entity_id)
count = 0
for violation in self.violations.values():
if violation["entity_id"] == entity_id and violation["status"] == "ACTIVE":
count += 1
return count
self.policies = {
"p_late": {"description": "late filing", "max_violations": 3},
}
self.violations = {
"v1": {"policy_id": "p_late", "entity_id": "ent_a", "severity": 4, "status": "ACTIVE", "expiry": None},
}
Step 1: The A dict holds policies.
Step 2: The flat B dict holds violations.
Step 3: Before flagging, check that the policy exists, the violation_id is unique, and the entity is under its cap.
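flag_violation depends on self._count_active_for_entity_policy, which is not shown on this page. A minimal sketch of the three pre-checks from Step 3, assuming the cap counts only ACTIVE violations for one (entity, policy) pair. The standalone functions count_active_for_entity_policy and can_flag are illustrative names.

```python
def count_active_for_entity_policy(violations, entity_id, policy_id):
    """Count ACTIVE violations that one entity holds under one policy."""
    return sum(
        1
        for v in violations.values()
        if v["entity_id"] == entity_id
        and v["policy_id"] == policy_id
        and v["status"] == "ACTIVE"
    )


def can_flag(policies, violations, policy_id, entity_id, violation_id):
    """Mirror the three pre-checks: policy exists, id is unique, entity is under its cap."""
    if policy_id not in policies or violation_id in violations:
        return False
    cap = policies[policy_id]["max_violations"]
    return count_active_for_entity_policy(violations, entity_id, policy_id) < cap


policies = {"p_late": {"description": "late filing", "max_violations": 2}}
violations = {
    "v1": {"policy_id": "p_late", "entity_id": "ent_a", "status": "ACTIVE"},
    "v2": {"policy_id": "p_late", "entity_id": "ent_a", "status": "CLEARED"},
    "v3": {"policy_id": "p_late", "entity_id": "ent_a", "status": "ACTIVE"},
}
at_cap = can_flag(policies, violations, "p_late", "ent_a", "v4")
fresh_entity = can_flag(policies, violations, "p_late", "ent_b", "v4")
```

ent_a has two ACTIVE violations against a cap of 2, so a third is rejected; the CLEARED one does not count toward the cap.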
def get_worst_entities(self, timestamp, n): # top n entity by ACTIVE violation count
self._purge(timestamp)
entity_counts = defaultdict(int)
for violation in self.violations.values():
if violation["status"] == "ACTIVE":
entity_counts[violation["entity_id"]] += 1
ranked = sorted(entity_counts.items(), key=lambda x: (-x[1], x[0])) # count desc,再 entity asc
return ranked[:n]
def find_violation(self, timestamp, violation_id): # direct lookup by violation_id
self._purge(timestamp)
if violation_id not in self.violations:
return None
violation = self.violations[violation_id]
return (violation["policy_id"], violation["entity_id"])
self.violations = {
"v1": {"policy_id": "p_late", "entity_id": "ent_a", "status": "ACTIVE"},
"v2": {"policy_id": "p_late", "entity_id": "ent_a", "status": "ACTIVE"},
"v3": {"policy_id": "p_tax", "entity_id": "ent_b", "status": "ACTIVE"},
}
Step 1: find_violation is a typical flat-B direct lookup.
Step 2: worst_entities scans all violations, then aggregates a count per entity.
Step 3: The sort key is (-count, entity_id).
def _purge(self, timestamp): # 掃 violations,過期就改 EXPIRED
for violation in self.violations.values():
if violation["status"] == "ACTIVE" and violation["expiry"] is not None and violation["expiry"] <= timestamp:
violation["status"] = "EXPIRED" # 呢題過期唔 delete,方便審計留痕
# TTL 已經整合喺 flag_violation 入面:
# expiry = (timestamp + ttl_ms) if ttl_ms > 0 else None
self.violations = {
"v1": {"entity_id": "ent_a", "status": "ACTIVE", "expiry": 8100},
"v2": {"entity_id": "ent_a", "status": "EXPIRED", "expiry": 5000},
}
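Step 1: TTL belongs to the violation, so expiry lives in the B dict.
Step 2: An expired violation is not deleted; its status becomes EXPIRED.
Step 3: Auditing needs the old violations kept so the history stays queryable.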
def backup(self, timestamp): # 影低 policies / violations / history / merged_entities
self._purge(timestamp)
violations_snap = {}
for violation_id, violation in self.violations.items():
row = copy.deepcopy(violation)
row["remaining_ttl"] = (row["expiry"] - timestamp) if row["expiry"] is not None and row["status"] == "ACTIVE" else None
del row["expiry"]
violations_snap[violation_id] = row
snap = {
"timestamp": timestamp,
"policies": copy.deepcopy(self.policies),
"violations": violations_snap,
"history": copy.deepcopy(dict(self.history)),
"merged_entities": copy.deepcopy(self.merged_entities),
}
self.backups.append(snap)
return len(self.backups) - 1
def restore(self, timestamp, backup_idx): # 用 backup index 還原
if backup_idx < 0 or backup_idx >= len(self.backups):
return False
snap = self.backups[backup_idx]
self.policies = copy.deepcopy(snap["policies"])
self.merged_entities = copy.deepcopy(snap["merged_entities"])
self.history = defaultdict(list, copy.deepcopy(snap["history"]))
self.violations = {}
for violation_id, row in snap["violations"].items():
restored = copy.deepcopy(row)
remaining = restored.pop("remaining_ttl")
restored["expiry"] = timestamp + remaining if remaining is not None and restored["status"] == "ACTIVE" else None
self.violations[violation_id] = restored
return True
def get_violation_history(self, timestamp, entity_id): # history 單獨按 entity_id track
self._purge(timestamp)
entity_id = self._resolve_entity(entity_id)
return list(self.history.get(entity_id, []))
def merge_entity(self, timestamp, from_entity, to_entity): # 將舊 entity 違規搬去新 entity
self._purge(timestamp)
from_entity = self._resolve_entity(from_entity)
to_entity = self._resolve_entity(to_entity)
if from_entity == to_entity:
return False
for violation in self.violations.values():
if violation["entity_id"] == from_entity:
violation["entity_id"] = to_entity
if from_entity in self.history:
self.history[to_entity].extend(self.history[from_entity])
del self.history[from_entity]
self.merged_entities[from_entity] = to_entity
return True
self.backups = [
{
"timestamp": 5000,
"policies": {"p_late": {"description": "late filing", "max_violations": 3}},
"violations": {"v1": {"policy_id": "p_late", "entity_id": "ent_a", "severity": 4, "status": "ACTIVE", "remaining_ttl": 1200}},
"history": {"ent_a": ["v1", "v2"]},
"merged_entities": {"ent_old": "ent_a"},
},
]
Step 1: On backup, convert each violation's expiry to remaining_ttl.
Step 2: entity_id is not a key of violations, so history needs its own self.history[entity_id].
Step 3: merge_entity updates both the violation owner and the history chain.
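The remaining_ttl conversion in Step 1 is plain arithmetic, worked through here with made-up timestamps. The helper names to_snapshot, from_snapshot, and is_expired are illustrative; the expiry rule (expired once timestamp >= expiry) follows the purge code above.

```python
def to_snapshot(expiry, backup_ts):
    """At backup time, store how much life is left instead of the absolute expiry."""
    return None if expiry is None else expiry - backup_ts


def from_snapshot(remaining_ttl, restore_ts):
    """At restore time, rebuild the absolute expiry from the restore timestamp."""
    return None if remaining_ttl is None else restore_ts + remaining_ttl


def is_expired(expiry, timestamp):
    """A record is dead from the expiry instant onward (alive only while timestamp < expiry)."""
    return expiry is not None and timestamp >= expiry


remaining = to_snapshot(expiry=6200, backup_ts=5000)
new_expiry = from_snapshot(remaining, restore_ts=9000)
```

Backing up at 5000 leaves 1200 ms of life, so restoring at 9000 yields a new expiry of 10200. Copying the old absolute expiry of 6200 would make the record dead the moment it is restored.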
async def batch_audit(self, timestamp, operations): # flag / clear / transfer 批量處理
results = []
for op in operations:
if op["type"] == "flag":
violation_id = op["violation_id"]
async with self.locks[violation_id]: # Pattern A:單 violation lock
results.append(self.flag_violation(timestamp, op["policy_id"], op["entity_id"], violation_id, op["severity"], op.get("ttl_ms", 0)))
elif op["type"] == "clear":
violation_id = op["violation_id"]
async with self.locks[violation_id]:
results.append(self.clear_violation(timestamp, violation_id))
elif op["type"] == "transfer":
violation_id = op["violation_id"]
to_entity = op["to_entity"]
keys = sorted([violation_id, to_entity]) # Pattern B:violation_id + to_entity pair-lock
async with self.locks[keys[0]]:
async with self.locks[keys[1]]:
results.append(self._transfer_violation(timestamp, violation_id, to_entity))
else:
results.append(False)
return results
def _transfer_violation(self, timestamp, violation_id, to_entity): # 搬一張 ACTIVE violation 去另一個 entity
self._purge(timestamp)
if violation_id not in self.violations:
return False
violation = self.violations[violation_id]
if violation["status"] != "ACTIVE":
return False
to_entity = self._resolve_entity(to_entity)
policy_id = violation["policy_id"]
max_v = self.policies[policy_id]["max_violations"]
if self._count_active_for_entity_policy(to_entity, policy_id) >= max_v:
return False
violation["entity_id"] = to_entity
self.history[to_entity].append(violation_id)
return True
self.violations = {
"v1": {"entity_id": "ent_a", "status": "ACTIVE"},
"v2": {"entity_id": "ent_b", "status": "CLEARED"},
}
self.locks = {
"v1": <asyncio.Lock>,
"ent_a": <asyncio.Lock>,
"ent_b": <asyncio.Lock>,
}
Step 1: flag / clear are single-violation ops, so Pattern A.
Step 2: transfer uses both violation_id and the target entity, so it is a pair-lock.
Step 3: Always sort the keys before pair-locking, to avoid deadlock.
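A minimal sketch of why the sort in Step 3 matters, using two opposite transfers that run concurrently. The balances dict and the transfer function are made up for the demonstration; the same-key guard is an added assumption, needed because asyncio.Lock is not reentrant.

```python
import asyncio
from collections import defaultdict


async def demo():
    locks = defaultdict(asyncio.Lock)
    balances = {"a": 100, "b": 100}

    async def transfer(src, dst, amount):
        if src == dst:
            return False  # locking the same key twice would deadlock
        first, second = sorted([src, dst])  # every task acquires in the same global order
        async with locks[first]:
            async with locks[second]:
                await asyncio.sleep(0)  # yield while holding both locks
                balances[src] -= amount
                balances[dst] += amount
                return True

    results = await asyncio.gather(
        transfer("a", "b", 30),
        transfer("b", "a", 10),
        transfer("a", "a", 5),
    )
    return results, balances


results, balances = asyncio.run(demo())
```

Without the sort, the first task could hold a while waiting for b and the second hold b while waiting for a; with it, both ask for a first, so one simply waits its turn.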
async def report_violations(self, timestamp, violation_ids, max_concurrent): # fail-fast + sem 匯報 violation
self._purge(timestamp)
for violation_id in violation_ids: # Phase 1:同步驗證
if violation_id not in self.violations:
raise ValueError(f"Violation {violation_id} does not exist")
if self.violations[violation_id]["status"] == "CLEARED":
raise ValueError(f"Violation {violation_id} is CLEARED") # CLEARED 唔應該再匯報
sem = asyncio.Semaphore(max_concurrent) # Phase 2:過關先限流
reported = []
async def _report_one(violation_id):
async with sem:
await asyncio.sleep(0.01) # 模擬 network call
self.violations[violation_id]["status"] = "REPORTED"
reported.append(violation_id)
tasks = [asyncio.create_task(_report_one(violation_id)) for violation_id in violation_ids]
await asyncio.gather(*tasks)
return reported
Persistent state:
self.violations = {
"v1": {"status": "ACTIVE"},
"v2": {"status": "REPORTED"},
}
self.history = defaultdict(list)
Transient runtime:
sem = asyncio.Semaphore(max_concurrent)
Step 1: Missing or CLEARED violations fail fast first.
Step 2: Violations that pass validation go through the semaphore and sleep.
Step 3: On success, set status to REPORTED.
Usage: classify the Family → open this page → copy the matching template → rename the class, dicts, and fields → done.
These 3 mocks have no domain-specific helpers (no _hash, no _ring, no _route). Just dict + for loop + asyncio.
Applies to: Bank, Leaderboard, Scheduler, Session types. A single entity with no sub-items.
import copy
import asyncio
from collections import defaultdict
class GenericF1:
def __init__(self):
self.items = {}
self.backups = []
self.merged_items = {}
self.locks = defaultdict(asyncio.Lock)
# ── L1 CRUD ──
    def create(self, timestamp, item_id, field1, field2):
        self._purge(timestamp)
        if item_id in self.items:
            return False
        self.items[item_id] = {
            "field1": field1,
            "field2": field2,
            "expiry": None,
            "history": [(timestamp, field1)],
        }
        return True

    def update(self, timestamp, item_id, field1):
        self._purge(timestamp)
        if item_id not in self.items:
            return None
        self.items[item_id]["field1"] = field1
        self.items[item_id]["history"].append((timestamp, field1))
        return self.items[item_id]["field1"]

    def get(self, timestamp, item_id):
        self._purge(timestamp)
        if item_id not in self.items:
            return None
        return self.items[item_id]["field1"]

    def delete(self, timestamp, item_id):
        self._purge(timestamp)
        if item_id not in self.items:
            return False
        del self.items[item_id]
        return True

    # ── L2 Sort ──
    def top_n(self, timestamp, n):
        self._purge(timestamp)
        items = []
        for item_id, info in self.items.items():
            items.append((-info["field1"], item_id))
        items.sort()
        result = []
        for val, item_id in items[:n]:
            result.append(item_id)
        return result

    # ── L3 TTL ──
    def create_with_ttl(self, timestamp, item_id, field1, field2, ttl_ms):
        self._purge(timestamp)
        if item_id in self.items:
            return False
        expiry = None
        if ttl_ms is not None:
            expiry = timestamp + ttl_ms
        self.items[item_id] = {
            "field1": field1,
            "field2": field2,
            "expiry": expiry,
            "history": [(timestamp, field1)],
        }
        return True

    def _purge(self, timestamp):
        # list(...) snapshots the keys so deleting while looping is safe
        for item_id in list(self.items.keys()):
            exp = self.items[item_id]["expiry"]
            if exp is not None and timestamp >= exp:
                del self.items[item_id]

    # ── L4 Backup (no TTL, Bank style) ──
    def backup(self, timestamp):
        self._purge(timestamp)
        snapshot = copy.deepcopy(self.items)
        self.backups.append((timestamp, snapshot))

    def restore(self, timestamp, target_ts):
        best = None
        for backup_ts, snapshot in self.backups:
            if backup_ts <= target_ts:
                best = (backup_ts, snapshot)
        if best is None:
            return False
        self.items = copy.deepcopy(best[1])
        return True

    # ── L4 History (value at time, reversed loop) ──
    def get_value_at(self, timestamp, item_id, time_at):
        self._purge(timestamp)
        if item_id not in self.items:
            return None
        for ts, val in reversed(self.items[item_id]["history"]):
            if ts <= time_at:
                return val
        return None

    # ── L4 Merge ──
    def merge(self, timestamp, id1, id2):
        self._purge(timestamp)
        if id1 not in self.items or id2 not in self.items:
            return False
        if id1 == id2:
            return False
        self.items[id1]["field1"] += self.items[id2]["field1"]
        self.items[id1]["history"].append((timestamp, self.items[id1]["field1"]))
        self.merged_items[id2] = id1
        del self.items[id2]
        return True

    # ── L5 Batch ──
    async def batch(self, timestamp, operations):
        self._purge(timestamp)

        async def execute_op(op):
            # two-key op
            if op["type"] == "transfer":
                # asyncio.Lock is not reentrant: locking the same key twice would deadlock
                if op["source_id"] == op["target_id"]:
                    return None
                keys = sorted([op["source_id"], op["target_id"]])
                async with self.locks[keys[0]]:
                    async with self.locks[keys[1]]:
                        # transfer logic
                        s = self.items.get(op["source_id"])
                        t = self.items.get(op["target_id"])
                        if not s or not t:
                            return None
                        amount = op["amount"]
                        s["field1"] -= amount
                        t["field1"] += amount
                        return s["field1"]
            # single-key op
            key = op["item_id"]
            async with self.locks[key]:
                if op["type"] == "create":
                    return self.create(timestamp, op["item_id"], op["field1"], op.get("field2", 0))
                elif op["type"] == "update":
                    return self.update(timestamp, op["item_id"], op["field1"])
                elif op["type"] == "delete":
                    return self.delete(timestamp, op["item_id"])
                return None

        tasks = []
        for op in operations:
            tasks.append(execute_op(op))
        results = await asyncio.gather(*tasks)
        return list(results)

    # ── L6 Rate Limited (fail-fast) ──
    async def sync(self, timestamp, item_ids, max_concurrent):
        self._purge(timestamp)
        sem = asyncio.Semaphore(max_concurrent)

        async def do_one(item_id):
            # fail-fast: reject before entering the semaphore
            if item_id not in self.items:
                return False
            async with sem:
                await asyncio.sleep(0.01)
            return True

        tasks = []
        for item_id in item_ids:
            tasks.append(do_one(item_id))
        results = await asyncio.gather(*tasks)
        return list(results)
Applies to: InMemDB, DNS, PubSub, Chat, and Parking types, where each container holds multiple sub-items.
import copy
import asyncio
from collections import defaultdict


class GenericF2:
    def __init__(self):
        self.containers = {}
        self.backups = []
        self.merged_containers = {}
        self.locks = defaultdict(asyncio.Lock)

    # ── L1 CRUD ──
    def create_container(self, timestamp, container_id, capacity):
        self._purge(timestamp)
        if container_id in self.containers:
            return False
        self.containers[container_id] = {
            "capacity": capacity,
            "subs": {},
            "history": [],
        }
        return True

    def add_sub(self, timestamp, container_id, sub_id, data):
        self._purge(timestamp)
        if container_id not in self.containers:
            return False
        container = self.containers[container_id]
        # check that sub_id is not a duplicate (globally, across all containers)
        for cid, c in self.containers.items():
            if sub_id in c["subs"]:
                return False
        if len(container["subs"]) >= container["capacity"]:
            return False
        container["subs"][sub_id] = {
            "data": data,
            "expiry": None,
        }
        container["history"].append(sub_id)
        return True

    def remove_sub(self, timestamp, container_id, sub_id):
        self._purge(timestamp)
        if container_id not in self.containers:
            return False
        if sub_id not in self.containers[container_id]["subs"]:
            return False
        del self.containers[container_id]["subs"][sub_id]
        return True

    def get_count(self, timestamp, container_id):
        self._purge(timestamp)
        if container_id not in self.containers:
            return None
        return len(self.containers[container_id]["subs"])

    # ── L2 Find sub (for loop) ──
    def find_sub(self, timestamp, sub_id):
        self._purge(timestamp)
        for cid, container in self.containers.items():
            if sub_id in container["subs"]:
                return cid
        return None

    def top_n(self, timestamp, n):
        self._purge(timestamp)
        items = []
        for cid, container in self.containers.items():
            count = len(container["subs"])
            items.append((-count, cid))
        items.sort()
        result = []
        for val, cid in items[:n]:
            result.append(cid)
        return result

    # ── L3 TTL (on sub-item) ──
    def add_sub_with_ttl(self, timestamp, container_id, sub_id, data, ttl_ms):
        self._purge(timestamp)
        if container_id not in self.containers:
            return False
        container = self.containers[container_id]
        for cid, c in self.containers.items():
            if sub_id in c["subs"]:
                return False
        if len(container["subs"]) >= container["capacity"]:
            return False
        expiry = None
        if ttl_ms is not None:
            expiry = timestamp + ttl_ms
        container["subs"][sub_id] = {
            "data": data,
            "expiry": expiry,
        }
        container["history"].append(sub_id)
        return True

    def _purge(self, timestamp):
        for cid, container in self.containers.items():
            for sub_id in list(container["subs"].keys()):
                exp = container["subs"][sub_id]["expiry"]
                if exp is not None and timestamp >= exp:
                    del container["subs"][sub_id]

    # ── L4 Backup (with TTL → remaining_ttl) ──
    def backup(self, timestamp):
        self._purge(timestamp)
        snapshot = {}
        for cid, container in self.containers.items():
            snapshot[cid] = {
                "capacity": container["capacity"],
                "history": list(container["history"]),
                "subs": {},
            }
            for sub_id, info in container["subs"].items():
                remaining = None
                if info["expiry"] is not None:
                    remaining = info["expiry"] - timestamp
                snapshot[cid]["subs"][sub_id] = {
                    "data": info["data"],
                    "remaining_ttl": remaining,
                }
        self.backups.append((timestamp, snapshot))

    def restore(self, timestamp, target_ts):
        best = None
        for backup_ts, snapshot in self.backups:
            if backup_ts <= target_ts:
                best = (backup_ts, snapshot)
        if best is None:
            return False
        self.containers = {}
        for cid, snap in best[1].items():
            self.containers[cid] = {
                "capacity": snap["capacity"],
                "history": list(snap["history"]),
                "subs": {},
            }
            for sub_id, info in snap["subs"].items():
                new_expiry = None
                if info["remaining_ttl"] is not None:
                    # recompute expiry from the restore time, not the old absolute expiry
                    new_expiry = timestamp + info["remaining_ttl"]
                self.containers[cid]["subs"][sub_id] = {
                    "data": info["data"],
                    "expiry": new_expiry,
                }
        return True

    # ── L4 History ──
    def get_history(self, timestamp, container_id):
        self._purge(timestamp)
        if container_id not in self.containers:
            return None
        return list(self.containers[container_id]["history"])

    # ── L5 Batch (single + pair lock) ──
    async def batch(self, timestamp, operations):
        self._purge(timestamp)

        async def execute_op(op):
            # two-key op: move sub between containers
            if op["type"] == "move":
                # asyncio.Lock is not reentrant: locking the same key twice would deadlock
                if op["from_container"] == op["to_container"]:
                    return False
                keys = sorted([op["from_container"], op["to_container"]])
                async with self.locks[keys[0]]:
                    async with self.locks[keys[1]]:
                        from_c = self.containers.get(op["from_container"])
                        to_c = self.containers.get(op["to_container"])
                        if not from_c or not to_c:
                            return False
                        sub_id = op["sub_id"]
                        if sub_id not in from_c["subs"]:
                            return False
                        if len(to_c["subs"]) >= to_c["capacity"]:
                            return False
                        to_c["subs"][sub_id] = from_c["subs"][sub_id]
                        del from_c["subs"][sub_id]
                        return True
            # single-key op
            cid = op["container_id"]
            async with self.locks[cid]:
                if op["type"] == "add":
                    return self.add_sub(timestamp, cid, op["sub_id"], op["data"])
                elif op["type"] == "remove":
                    return self.remove_sub(timestamp, cid, op["sub_id"])
                return None

        tasks = []
        for op in operations:
            tasks.append(execute_op(op))
        results = await asyncio.gather(*tasks)
        return list(results)

    # ── L6 Rate Limited (fail-fast) ──
    async def sync(self, timestamp, container_ids, max_concurrent):
        self._purge(timestamp)
        sem = asyncio.Semaphore(max_concurrent)

        async def do_one(cid):
            if cid not in self.containers:
                return False
            async with sem:
                await asyncio.sleep(0.01)
            return True

        tasks = []
        for cid in container_ids:
            tasks.append(do_one(cid))
        results = await asyncio.gather(*tasks)
        return list(results)
Applies to: Workflow, Compliance, and Moderation types, where two different kinds of entity each have their own dict.
import copy
import asyncio
from collections import defaultdict


class GenericF3:
    def __init__(self):
        self.groups = {}
        self.items = {}
        self.backups = []
        self.merged_groups = {}
        self.history = defaultdict(list)
        self.locks = defaultdict(asyncio.Lock)

    # ── L1 CRUD ──
    def create_group(self, timestamp, group_id, config):
        self._purge(timestamp)
        if group_id in self.groups:
            return False
        self.groups[group_id] = {"config": config}
        return True

    def create_item(self, timestamp, group_id, item_id, other_id, data):
        self._purge(timestamp)
        if group_id not in self.groups:
            return False
        if item_id in self.items:
            return False
        self.items[item_id] = {
            "group_id": group_id,
            "other_id": other_id,
            "data": data,
            "status": "ACTIVE",
            "expiry": None,
        }
        self.history[group_id].append(item_id)
        return True

    def clear_item(self, timestamp, item_id):
        self._purge(timestamp)
        if item_id not in self.items:
            return False
        self.items[item_id]["status"] = "CLEARED"
        return True

    def get_active_count(self, timestamp, other_id):
        self._purge(timestamp)
        count = 0
        for iid, item in self.items.items():
            if item["other_id"] == other_id and item["status"] == "ACTIVE":
                count += 1
        return count

    # ── L2 Sort / Find ──
    def find_item(self, timestamp, item_id):
        self._purge(timestamp)
        if item_id not in self.items:
            return None
        item = self.items[item_id]
        return (item["group_id"], item["other_id"])

    def top_n(self, timestamp, n):
        self._purge(timestamp)
        counts = {}
        for iid, item in self.items.items():
            if item["status"] == "ACTIVE":
                oid = item["other_id"]
                if oid not in counts:
                    counts[oid] = 0
                counts[oid] += 1
        items = []
        for oid, count in counts.items():
            items.append((-count, oid))
        items.sort()
        result = []
        for val, oid in items[:n]:
            result.append(oid)
        return result

    # ── L3 TTL ──
    def create_item_with_ttl(self, timestamp, group_id, item_id, other_id, data, ttl_ms):
        self._purge(timestamp)
        if group_id not in self.groups:
            return False
        if item_id in self.items:
            return False
        expiry = None
        if ttl_ms is not None:
            expiry = timestamp + ttl_ms
        self.items[item_id] = {
            "group_id": group_id,
            "other_id": other_id,
            "data": data,
            "status": "ACTIVE",
            "expiry": expiry,
        }
        self.history[group_id].append(item_id)
        return True

    def _purge(self, timestamp):
        # status flip instead of delete: the record stays in self.items
        for item_id in list(self.items.keys()):
            exp = self.items[item_id]["expiry"]
            if exp is not None and timestamp >= exp:
                self.items[item_id]["status"] = "EXPIRED"

    # ── L4 Backup (with TTL → remaining_ttl) ──
    def backup(self, timestamp):
        self._purge(timestamp)
        snap_groups = copy.deepcopy(self.groups)
        snap_items = {}
        for item_id, info in self.items.items():
            remaining = None
            if info["expiry"] is not None:
                remaining = info["expiry"] - timestamp
            snap_items[item_id] = {
                "group_id": info["group_id"],
                "other_id": info["other_id"],
                "data": info["data"],
                "status": info["status"],
                "remaining_ttl": remaining,
            }
        snap_history = copy.deepcopy(dict(self.history))
        self.backups.append((timestamp, snap_groups, snap_items, snap_history))

    def restore(self, timestamp, target_ts):
        best = None
        for entry in self.backups:
            if entry[0] <= target_ts:
                best = entry
        if best is None:
            return False
        backup_ts, snap_groups, snap_items, snap_history = best
        self.groups = copy.deepcopy(snap_groups)
        self.items = {}
        for item_id, info in snap_items.items():
            new_expiry = None
            if info["remaining_ttl"] is not None:
                new_expiry = timestamp + info["remaining_ttl"]
            self.items[item_id] = {
                "group_id": info["group_id"],
                "other_id": info["other_id"],
                "data": info["data"],
                "status": info["status"],
                "expiry": new_expiry,
            }
        self.history = defaultdict(list)
        for k, v in snap_history.items():
            self.history[k] = list(v)
        return True

    # ── L4 History ──
    def get_history(self, timestamp, group_id):
        # .get() avoids creating an empty entry in the defaultdict
        return list(self.history.get(group_id, []))

    # ── L4 Merge ──
    def merge_group(self, timestamp, from_id, to_id):
        self._purge(timestamp)
        if from_id not in self.groups or to_id not in self.groups:
            return False
        if from_id == to_id:
            return False
        for item_id, item in self.items.items():
            if item["group_id"] == from_id:
                item["group_id"] = to_id
        self.merged_groups[from_id] = to_id
        del self.groups[from_id]
        return True

    # ── L5 Batch ──
    async def batch(self, timestamp, operations):
        self._purge(timestamp)

        async def execute_op(op):
            # two-key op: transfer item between other_ids
            if op["type"] == "transfer":
                # asyncio.Lock is not reentrant: locking the same key twice would deadlock
                if op["from_other"] == op["to_other"]:
                    return False
                keys = sorted([op["from_other"], op["to_other"]])
                async with self.locks[keys[0]]:
                    async with self.locks[keys[1]]:
                        item = self.items.get(op["item_id"])
                        if not item:
                            return False
                        item["other_id"] = op["to_other"]
                        return True
            # single-key op
            item_id = op.get("item_id", "")
            async with self.locks[item_id]:
                if op["type"] == "create":
                    return self.create_item(timestamp, op["group_id"], op["item_id"], op["other_id"], op["data"])
                elif op["type"] == "clear":
                    return self.clear_item(timestamp, op["item_id"])
                return None

        tasks = []
        for op in operations:
            tasks.append(execute_op(op))
        results = await asyncio.gather(*tasks)
        return list(results)

    # ── L6 Rate Limited (fail-fast) ──
    async def report(self, timestamp, item_ids, max_concurrent):
        self._purge(timestamp)
        sem = asyncio.Semaphore(max_concurrent)

        async def do_one(item_id):
            if item_id not in self.items:
                return False
            if self.items[item_id]["status"] != "ACTIVE":
                return False
            async with sem:
                await asyncio.sleep(0.01)
            self.items[item_id]["status"] = "REPORTED"
            return True

        tasks = []
        for item_id in item_ids:
            tasks.append(do_one(item_id))
        results = await asyncio.gather(*tasks)
        return list(results)
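Below are the variants that can appear at each level. Not every problem has them; check the spec to decide whether to add one. Each can be copy-pasted straight into your class.
L2 is basically sort + query. The only thing that varies is the sort key.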
# ── Variance A: Top N by single value ──
def top_n(self, timestamp, n):
    items = []
    for item_id, info in self.items.items():
        items.append((-info["score"], item_id))  # negative value = descending
    items.sort()
    result = []
    for val, item_id in items[:n]:
        result.append(item_id)
    return result

# ── Variance B: Top N by count of sub-items ──
def top_n_by_count(self, timestamp, n):
    items = []
    for cid, container in self.containers.items():
        count = len(container["subs"])
        items.append((-count, cid))
    items.sort()
    result = []
    for val, cid in items[:n]:
        result.append(cid)
    return result

# ── Variance C: Top N by percentage ──
def top_n_by_pct(self, timestamp, n):
    items = []
    for cid, container in self.containers.items():
        used = len(container["subs"])
        cap = container["capacity"]
        pct = used / cap if cap > 0 else 0  # guard against division by zero
        items.append((-pct, cid))
    items.sort()
    result = []
    for val, cid in items[:n]:
        result.append(cid)
    return result

# ── Variance D: Prefix search ──
def search(self, timestamp, prefix):
    result = []
    for item_id in sorted(self.items.keys()):
        if item_id.startswith(prefix):
            result.append(item_id)
    return result
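The Top N variants above all rely on the same trick: sort tuples of `(-value, id)` so the value is descending and ties fall back to the id in ascending order. A minimal standalone sketch of that idea, using `sorted` with a key function instead of a prebuilt tuple list; the `scores` data and the free-standing `top_n` helper are hypothetical and exist only for illustration.

```python
def top_n(scores, n):
    """Return the ids of the n highest scores; ties are broken by id ascending."""
    # (-score, id): negating the score gives descending order by value,
    # while the id stays ascending as the tie-break.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item_id for item_id, _ in ranked[:n]]


# Hypothetical data: alice and bob tie on 70.
scores = {"carol": 50, "alice": 70, "bob": 70, "dave": 10}
print(top_n(scores, 3))
```

Both spellings give the same order; the tuple-list form used in the variants is simply easier to type from memory.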
The core of L3 is TTL + purge. The variants differ in how the purge is done and when it is triggered.
# ── Variance A: Standard purge (del expired) ──
def _purge(self, timestamp):
    for item_id in list(self.items.keys()):
        exp = self.items[item_id]["expiry"]
        if exp is not None and timestamp >= exp:
            del self.items[item_id]

# ── Variance B: Purge sub-items inside container ──
def _purge_subs(self, timestamp):
    for cid, container in self.containers.items():
        for sub_id in list(container["subs"].keys()):
            exp = container["subs"][sub_id]["expiry"]
            if exp is not None and timestamp >= exp:
                del container["subs"][sub_id]

# ── Variance C: Status flip instead of delete ──
def _purge_status(self, timestamp):
    for item_id in list(self.items.keys()):
        exp = self.items[item_id]["expiry"]
        if exp is not None and timestamp >= exp:
            self.items[item_id]["status"] = "EXPIRED"

# ── Variance D: Lazy trigger (cashback / scheduled event) ──
def _process_scheduled(self, timestamp):
    for pid, info in self.pending.items():
        if not info["done"] and timestamp >= info["trigger_time"]:
            # run the scheduled action
            self.items[info["target"]]["field1"] += info["amount"]
            info["done"] = True

# ── Variance E: Extend TTL ──
def extend(self, timestamp, item_id, extra_ms):
    if item_id not in self.items:
        return False
    exp = self.items[item_id]["expiry"]
    if exp is None:
        return False
    self.items[item_id]["expiry"] = exp + extra_ms
    return True
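Every purge variant above uses `timestamp >= exp` as the "dead" test, which is the mirror image of the "strictly less than" rule: an item is alive only while `timestamp < expiry`. The sketch below checks the boundary with concrete numbers; the `is_alive` helper and the values are assumptions made for illustration, not part of any mock.

```python
def is_alive(timestamp, expiry):
    """No expiry means the item never dies; otherwise alive only while timestamp < expiry."""
    return expiry is None or timestamp < expiry


created_at, ttl_ms = 100, 50
expiry = created_at + ttl_ms  # 150

print(is_alive(149, expiry))  # last moment the item is still alive
print(is_alive(150, expiry))  # at exactly the expiry time the item is already dead
```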
L4 has the most variants. Each one below is an independent add-on; add it only when the spec asks for it.
# ══════════════════════════════════
# Variance A: Backup without TTL (Bank style)
# ══════════════════════════════════
def backup_simple(self, timestamp):
    snapshot = copy.deepcopy(self.items)
    self.backups.append((timestamp, snapshot))

def restore_simple(self, timestamp, target_ts):
    best = None
    for backup_ts, snapshot in self.backups:
        if backup_ts <= target_ts:
            best = (backup_ts, snapshot)
    if best is None:
        return False
    self.items = copy.deepcopy(best[1])
    return True

# ══════════════════════════════════
# Variance B: Backup with TTL (InMemDB style)
# ══════════════════════════════════
def backup_with_ttl(self, timestamp):
    snapshot = {}
    for item_id, info in self.items.items():
        remaining = None
        if info["expiry"] is not None:
            remaining = info["expiry"] - timestamp
        snapshot[item_id] = {
            "data": info["data"],
            "remaining_ttl": remaining,
        }
    self.backups.append((timestamp, snapshot))

def restore_with_ttl(self, timestamp, target_ts):
    best = None
    for backup_ts, snapshot in self.backups:
        if backup_ts <= target_ts:
            best = (backup_ts, snapshot)
    if best is None:
        return False
    self.items = {}
    for item_id, info in best[1].items():
        new_expiry = None
        if info["remaining_ttl"] is not None:
            new_expiry = timestamp + info["remaining_ttl"]
        self.items[item_id] = {
            "data": info["data"],
            "expiry": new_expiry,
        }
    return True

# ══════════════════════════════════
# Variance C: History, value at time (Bank style)
# ══════════════════════════════════
# on create: item["history"] = [(timestamp, initial_value)]
# on every value change: item["history"].append((timestamp, new_value))
def get_value_at(self, timestamp, item_id, time_at):
    if item_id not in self.items:
        return None
    for ts, val in reversed(self.items[item_id]["history"]):
        if ts <= time_at:
            return val
    return None

# ══════════════════════════════════
# Variance D: History, event list (Workflow style)
# ══════════════════════════════════
# __init__: self.history = defaultdict(list)
# on every status change: self.history[id].append("step1: OLD->NEW")
def get_event_history(self, timestamp, group_id):
    return list(self.history.get(group_id, []))

# ══════════════════════════════════
# Variance E: Merge (Bank style: add the numbers together + del source)
# ══════════════════════════════════
def merge(self, timestamp, id1, id2):
    if id1 not in self.items or id2 not in self.items:
        return False
    if id1 == id2:
        return False
    self.items[id1]["field1"] += self.items[id2]["field1"]
    self.items[id1]["history"].append((timestamp, self.items[id1]["field1"]))
    self.merged_items[id2] = id1
    del self.items[id2]
    return True

# ══════════════════════════════════
# Variance F: Move / Upgrade (Hotel style: move data + clear source)
# ══════════════════════════════════
def move(self, timestamp, from_id, to_id):
    if from_id not in self.items or to_id not in self.items:
        return False
    if from_id == to_id:
        return False
    from_item = self.items[from_id]
    to_item = self.items[to_id]
    if from_item["occupant"] == "":
        return False
    if to_item["occupant"] != "":
        return False
    to_item["occupant"] = from_item["occupant"]
    from_item["occupant"] = ""
    return True

# ══════════════════════════════════
# Variance G: Copy (FS style: duplicate + source unchanged + remaining TTL)
# ══════════════════════════════════
def copy_item(self, timestamp, source_id, dest_id):
    if source_id not in self.items:
        return False
    src = self.items[source_id]
    if src["expiry"] is None:
        new_exp = None
    else:
        # the copy inherits the source's remaining lifespan
        remaining = src["expiry"] - timestamp
        new_exp = timestamp + remaining
    self.items[dest_id] = {
        "data": src["data"],
        "expiry": new_exp,
    }
    return True

# ══════════════════════════════════
# Variance H: Count-Based Eviction (Hashring style)
# ══════════════════════════════════
# __init__: self.capacities = {}
# (the helpers below also read self.assignments and self.access_times)
def set_capacity(self, timestamp, container_id, capacity):
    if container_id not in self.containers:
        return False
    self.capacities[container_id] = capacity
    return True

def get_capacity(self, timestamp, container_id):
    if container_id not in self.containers:
        return 0
    return self.capacities.get(container_id, -1)

def _evict_lru(self, container_id):
    candidates = []
    for sub_id, cid in self.assignments.items():
        if cid == container_id:
            candidates.append(sub_id)
    if not candidates:
        return None
    lru = None
    lru_time = None
    for sub_id in candidates:
        access = self.access_times[sub_id]
        if lru is None or access < lru_time:
            lru = sub_id
            lru_time = access
    del self.assignments[lru]
    if lru in self.access_times:
        del self.access_times[lru]
    return lru

# add inside store():
#     capacity = self.capacities.get(container_id, -1)
#     if capacity != -1 and not already_here and used >= capacity:
#         self._evict_lru(container_id)

# ══════════════════════════════════
# Variance I: Size-Based Eviction + While Loop (ChatRoute style)
# ══════════════════════════════════
# __init__: self.memory_limits = {}, self.item_sizes = {}
def set_memory_limit(self, timestamp, container_id, max_mb):
    if container_id not in self.containers:
        return False
    self.memory_limits[container_id] = max_mb
    return True

def _total_size(self, container_id):
    total = 0
    for sub_id, cid in self.assignments.items():
        if cid == container_id:
            total += self.item_sizes.get(sub_id, 0)
    return total

def _evict_lru_from(self, container_id):
    candidates = []
    for sub_id, cid in self.assignments.items():
        if cid == container_id:
            candidates.append(sub_id)
    if not candidates:
        return None
    lru = None
    lru_time = None
    for sub_id in candidates:
        access = self.access_times[sub_id]
        if lru is None or access < lru_time:
            lru = sub_id
            lru_time = access
    del self.assignments[lru]
    if lru in self.item_sizes:
        del self.item_sizes[lru]
    if lru in self.access_times:
        del self.access_times[lru]
    return lru

# add inside assign():
#     limit = self.memory_limits.get(container_id, -1)
#     if limit != -1 and not already_here:
#         current = self._total_size(container_id)
#         while current + size_mb > limit:
#             evicted = self._evict_lru_from(container_id)
#             if evicted is None:
#                 break
#             current = self._total_size(container_id)

# ══════════════════════════════════
# Variance J: Dependency + Ready/Blocked (TaskQueue style)
# ══════════════════════════════════
# add "dependencies": [] to the task dict
def _deps_met(self, task_id):
    deps = self.items[task_id]["dependencies"]
    for dep_id in deps:
        if dep_id not in self.items:
            return False
        if self.items[dep_id]["status"] != "COMPLETED":
            return False
    return True

def get_ready(self, timestamp):
    result = []
    for tid, task in self.items.items():
        if task["status"] != "QUEUED":
            continue
        if not self._deps_met(tid):
            continue
        result.append((-task["priority"], tid))
    result.sort()
    return [x[1] for x in result]

def get_blocked(self, timestamp):
    result = []
    for tid, task in self.items.items():
        if task["status"] != "QUEUED":
            continue
        if self._deps_met(tid):
            continue
        result.append(tid)
    result.sort()
    return result

# ══════════════════════════════════
# Variance K: State Machine + Rollback (Workflow style)
# ══════════════════════════════════
# __init__: self.step_status = {}  ← tuple key (group_id, step_id)
#           self.history = defaultdict(list)
# (assumes self.groups[group_id] is a list of (step_id, step_name) pairs)
def _set_status(self, group_id, step_id, new_status):
    key = (group_id, step_id)
    old_status = self.step_status[key]
    self.step_status[key] = new_status
    self.history[group_id].append(step_id + ": " + old_status + "->" + new_status)

def fail_step(self, group_id, step_id):
    key = (group_id, step_id)
    if key not in self.step_status:
        return "not found"
    if self.step_status[key] != "PROCESSING":
        return "not processing"
    # first: mark FAILED
    self._set_status(group_id, step_id, "FAILED")
    # second: roll back every COMPLETED step to PENDING
    for s_id, s_name in self.groups[group_id]:
        s_key = (group_id, s_id)
        if self.step_status[s_key] == "COMPLETED":
            self._set_status(group_id, s_id, "PENDING")
    return "failed and rolled back"
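The remaining-TTL backup in Variance B is easiest to trust after tracing it once with real numbers. The sketch below strips it down to two free functions over a plain dict; the function names, the `now` parameter, and the sample data are assumptions made for illustration.

```python
def backup(items, now):
    """Snapshot items, storing the remaining lifespan instead of the absolute expiry."""
    snapshot = {}
    for key, info in items.items():
        exp = info["expiry"]
        snapshot[key] = {
            "data": info["data"],
            "remaining_ttl": None if exp is None else exp - now,
        }
    return snapshot


def restore(snapshot, now):
    """Rebuild items, recomputing each expiry relative to the restore time."""
    items = {}
    for key, info in snapshot.items():
        rem = info["remaining_ttl"]
        items[key] = {
            "data": info["data"],
            "expiry": None if rem is None else now + rem,
        }
    return items


items = {"a": {"data": 1, "expiry": 150}, "b": {"data": 2, "expiry": None}}
snap = backup(items, now=100)      # "a" has 50 ms left at backup time
restored = restore(snap, now=400)  # "a" gets a fresh 50 ms starting from 400
print(restored["a"]["expiry"])
```

Had the snapshot kept the absolute expiry of 150, "a" would already be dead at restore time 400, which is exactly the mistake the remaining-TTL pattern avoids.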
The L5 skeleton is always the same; only the lock scope differs.
# ══════════════════════════════════
# Variance A: Single-key lock (most mocks)
# ══════════════════════════════════
async def execute_op(op):
    key = op["item_id"]
    async with self.locks[key]:
        if op["type"] == "create":
            return self.create(timestamp, op["item_id"], op["data"])
        elif op["type"] == "delete":
            return self.delete(timestamp, op["item_id"])
        return None

# ══════════════════════════════════
# Variance B: Pair-lock (transfer / upgrade / copy)
# ══════════════════════════════════
async def execute_op(op):
    if op["type"] == "transfer":
        # assumes source_id != target_id: asyncio.Lock is not reentrant,
        # so taking the same key twice would deadlock
        keys = sorted([op["source_id"], op["target_id"]])
        async with self.locks[keys[0]]:
            async with self.locks[keys[1]]:
                return self.transfer(timestamp, op["source_id"], op["target_id"], op["amount"])
    # single-key fallback
    key = op["item_id"]
    async with self.locks[key]:
        ...

# ══════════════════════════════════
# Variance C: Worker Pool (TaskQueue style)
# ══════════════════════════════════
async def run_workers(self, timestamp, num_workers):
    completed_order = []

    async def worker():
        while True:
            async with self._lock:  # global lock
                tid = self._get_next_ready_task_id()
                if not tid:
                    return  # nothing left to do → stop
                self.items[tid]["status"] = "PROCESSING"
            await asyncio.sleep(0.01)  # do the work outside the lock
            async with self._lock:
                self.items[tid]["status"] = "COMPLETED"
                completed_order.append(tid)

    workers = []
    for _ in range(num_workers):
        workers.append(worker())
    await asyncio.gather(*workers)
    return completed_order
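The pair-lock variant only prevents deadlock because every coroutine acquires the two locks in the same sorted order. The sketch below runs two opposite transfers at once to show that they both finish; the `balances` dict, the `run_transfers` wrapper, and the one-second timeout are assumptions made for illustration.

```python
import asyncio
from collections import defaultdict


async def run_transfers():
    balances = {"a": 100, "b": 100}
    locks = defaultdict(asyncio.Lock)  # one lock per key, created on first use

    async def transfer(src, dst, amount):
        # Both a->b and b->a lock "a" first, then "b", so neither can hold
        # one lock while waiting forever for the other.
        first, second = sorted([src, dst])
        async with locks[first]:
            await asyncio.sleep(0)  # yield so the other transfer gets a chance to interleave
            async with locks[second]:
                balances[src] -= amount
                balances[dst] += amount

    # The timeout turns a deadlock into a visible error instead of a hang.
    await asyncio.wait_for(
        asyncio.gather(transfer("a", "b", 30), transfer("b", "a", 10)),
        timeout=1.0,
    )
    return balances


final_balances = asyncio.run(run_transfers())
print(final_balances)
```

If each transfer locked `src` first and `dst` second instead, the two coroutines could each hold one lock and wait on the other, and the `wait_for` would time out.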
The core of L6 is semaphore + sleep. The only thing that varies is whether a failed item still has to sleep.
# ══════════════════════════════════
# Variance A: Fail-Fast (most mocks)
# check fails → return immediately, never enter sem
# ══════════════════════════════════
async def sync_fail_fast(self, timestamp, item_ids, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)

    async def do_one(item_id):
        # fail-fast check (before the sem!)
        if item_id not in self.items:
            return False
        # only enter the sem after passing the check
        async with sem:
            await asyncio.sleep(0.01)
        return True

    tasks = []
    for item_id in item_ids:
        tasks.append(do_one(item_id))
    results = await asyncio.gather(*tasks)
    return list(results)

# ══════════════════════════════════
# Variance B: All-Sleep (InMemDB / Notification / PubSub / LogAgg)
# every item enters sem + sleep, no fail-fast
# ══════════════════════════════════
async def sync_all_sleep(self, timestamp, item_ids, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)

    async def do_one(item_id):
        async with sem:
            await asyncio.sleep(0.01)  # everyone sleeps
        if item_id not in self.items:
            return (item_id, None)  # check only after the sleep
        return (item_id, self.items[item_id]["data"])

    tasks = []
    for item_id in item_ids:
        tasks.append(do_one(item_id))
    results = await asyncio.gather(*tasks)
    return dict(results)

# ══════════════════════════════════
# Variance C: Fail-Fast + Lock + Status Change (Bank / TaskQueue style)
# lock wraps the local mutation, sem wraps the external call; keep them separate!
# ══════════════════════════════════
async def process_external(self, timestamp, items, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)

    async def do_one(item):
        item_id = item["item_id"]
        # 1. lock wraps the local check + mutation
        async with self.locks[item_id]:
            if item_id not in self.items:
                return False
            if self.items[item_id]["field1"] < item["amount"]:
                return False
            self.items[item_id]["field1"] -= item["amount"]
        # 2. sem wraps the external call (only after leaving the lock!)
        async with sem:
            await asyncio.sleep(0.01)
        return True

    tasks = []
    for item in items:
        tasks.append(do_one(item))
    results = await asyncio.gather(*tasks)
    return list(results)
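The difference between fail-fast and all-sleep is not visible in the return values, only in how many simulated external calls happen. The sketch below counts the sleeps for the same input under both policies; the `sync_items` function, its `fail_fast` flag, and the sample store are assumptions made for illustration.

```python
import asyncio


async def sync_items(store, item_ids, max_concurrent, fail_fast):
    sem = asyncio.Semaphore(max_concurrent)
    sleeps = 0  # how many simulated external calls were made

    async def do_one(item_id):
        nonlocal sleeps
        if fail_fast and item_id not in store:
            return False  # rejected before touching the semaphore
        async with sem:
            sleeps += 1
            await asyncio.sleep(0.01)  # stand-in for the external call
        return item_id in store  # all-sleep checks only after the call

    results = await asyncio.gather(*(do_one(i) for i in item_ids))
    return list(results), sleeps


store = {"a": 1, "b": 2}
ids = ["a", "missing", "b"]
ff_results, ff_sleeps = asyncio.run(sync_items(store, ids, 2, fail_fast=True))
as_results, as_sleeps = asyncio.run(sync_items(store, ids, 2, fail_fast=False))
print(ff_results, ff_sleeps)
print(as_results, as_sleeps)
```

Both policies return the same list here, so a test that only compares results cannot tell them apart; the spec wording and any timing check decide which one is expected.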
| Where you are stuck | What to search | What you will find |
|---|---|---|
| No idea how to start the class | python in-memory [domain] class implementation github (e.g. python in-memory reservation system class implementation github) | Classes other people have written on GitHub; their __init__ tells you the data structure |
| Not sure whether to use a dict or a list | python dict of dicts nested structure example | How to write and access a nested dict |
| Not sure how to store key-value pairs | python key value store class implementation | InMemDB-style classes |

| Where you are stuck | What to search | What you will find |
|---|---|---|
| Can't sort a dict by value | python sort dictionary by value descending | sorted(d.items(), key=lambda x: -x[1]) |
| Can't sort with a tie-break | python sorted multiple keys tuple | sorted(items, key=lambda x: (-x[1], x[0])) |
| Can't do a prefix search | python string startswith filter dict | if key.startswith(prefix): |
| Can't count items matching a condition | python count dict values matching condition | for loop + counter |

| Where you are stuck | What to search | What you will find |
|---|---|---|
| Don't know how to do TTL | python cache with TTL implementation class | Add an expiry field to the item + purge logic |
| Don't know how to purge expired items | python remove expired items from dict | for k in list(d.keys()): if expired: del d[k] |
| Don't know how to compute remaining TTL | python calculate remaining time to live | remaining = expiry - current_time |

| Where you are stuck | What to search | What you will find |
|---|---|---|
| Don't know how to back up a dict | python snapshot restore deepcopy dict | import copy; snapshot = copy.deepcopy(d) |
| Don't know how to restore to a timestamp | python state backup rollback by timestamp | list of (ts, snapshot); loop to find the latest one <= target |
| Don't know how to record history | python track value changes over time list append | history.append((timestamp, value)) |
| Don't know how to back up with remaining_ttl | python backup restore with TTL remaining time | backup stores remaining = expiry - ts; restore uses ts + remaining |
| Don't know how to merge two dict entries | python merge two dict entries combine values | Add the numeric fields together + del source |

| Where you are stuck | What to search | What you will find |
|---|---|---|
| Don't know basic asyncio | python asyncio gather example simple | async def + gather pattern |
| Don't know async with lock | python asyncio lock per key defaultdict example | self.locks = defaultdict(asyncio.Lock) |
| Don't know how pair lock prevents deadlock | python asyncio lock two keys sorted order deadlock | keys = sorted([a, b]); async with lock[keys[0]]: async with lock[keys[1]]: |
| Don't know worker pool | python asyncio worker pool while loop queue | N workers, each running a while True loop to grab tasks |

| Where you are stuck | What to search | What you will find |
|---|---|---|
| Don't know semaphore | python asyncio semaphore rate limit example | sem = asyncio.Semaphore(N); async with sem: |
| Don't know how to write fail-fast | python asyncio semaphore skip invalid items | check before entering sem; return False early |
| Don't know how to separate lock + sem | python asyncio lock then semaphore separate | lock wraps the local mutation, sem wraps the external sleep; do not nest them |

| Error Message | What to search |
|---|---|
| TypeError: f() takes N positional arguments but M were given | python self missing method argument |
| RuntimeError: dictionary changed size during iteration | python delete from dict while iterating |
| AttributeError: 'dict' object has no attribute 'X' | python dict bracket vs dot access |
| TypeError: unhashable type: 'list' | python list as dict key tuple instead |
| AttributeError: 'NoneType' object has no attribute 'X' | python function not returning value None |
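Of the errors listed above, "dictionary changed size during iteration" is the one the purge helpers are most likely to trigger. The sketch below reproduces it and shows the `list(d.keys())` fix used throughout this handbook; the two function names and the sample data are assumptions made for illustration.

```python
def purge_unsafe(d, now):
    for key in d:          # iterating the live dict
        if d[key] <= now:
            del d[key]     # the loop raises RuntimeError on its next step


def purge_safe(d, now):
    for key in list(d.keys()):  # iterate over a snapshot of the keys
        if d[key] <= now:
            del d[key]


items = {"a": 5, "b": 50, "c": 7}  # key -> expiry

try:
    purge_unsafe(dict(items), now=10)
except RuntimeError as exc:
    print("unsafe purge failed:", exc)

safe = dict(items)
purge_safe(safe, now=10)
print(safe)
```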
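Search python in-memory [the domain name you see] class implementation github. About eight times out of ten someone has already written one.
Usage: pick one domain on the left and one on the right; the view below then splits automatically into Init + L1 through L6.
Alignment rule: a row only holds functions whose meaning matches; if there is no counterpart the cell stays empty rather than forcing a match.
Example: with Bank on the left and Hotel on the right, you will see same-skeleton functions such as create_account ↔ add_room and top_spenders ↔ top_rooms lined up on the same row.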
| Pair | What is the same | Most important difference |
|---|---|---|
| Hashring ↔ ChatRoute | ring, clockwise route, virtual nodes, reroute after topology change, L5 lock per request_id, L6 mostly fail-fast | ChatRoute load is size_mb; L4 eviction is by memory; L6 adds a bandwidth check |
| InMemDB ↔ DNS | nested store, inline TTL, ts < expiry, backup stores remaining TTL, restore recomputes expiry, L5 lock per top-level key | InMemDB L2 uses the scan / scan_by_prefix format; DNS has domain / resolve / wildcard semantics; L6 may switch to fail-fast |
| TaskQueue ↔ Workflow | state machine, status transition, dependency flavour, similar lifecycle / dispatch feel at L6 | TaskQueue has retry/backoff + worker pool; Workflow is closer to multi-step transitions / rollback |
| Bank ↔ FileSystem | flat dict base, L2 sort/filter/format, L5 single-key / two-key lock, fail-fast common at L6 | Bank L3 is scheduled cashback; FileSystem L3 is TTL expiry; FileSystem has copy overwrite and quota |
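This page is not a dictionary. What you should do is run from `L1` all the way to `L6` and, each time you meet a term, map it straight back to what it looks like in the mock.
The third column uses only the mock's view. That is, `TTL` is no longer explained in the abstract; it tells you directly what it looks like in `FS / Session` and what it looks like in `InMemDB / DNS`.
How to read: look at the second column first to recognise the shape, then the mock view in the third column, and only then click the fourth column to go back to the original text. You do not need to memorise term definitions now; you only need to jump to the right mock when you see a term.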
| Easily confused | A | B | What to force yourself to remember |
|---|---|---|---|
| Reject vs Overwrite | Reject duplicate: Bank / Hashring / Hotel | Overwrite existing: DNS / InMemDB set | Both are called add / set, but the duplicate policy can be exactly opposite. When you see "if exists return False" versus "replace existing", split them immediately. |
| Two kinds of TTL | Purge / real delete: FS / Session / PubSub / LogAgg | Inline check / kept but treated as dead: InMemDB / DNS | Purge means an expired item is no longer in the collection. Inline means the record is still in the dict, but a read returns empty / treats it as invalid. |
| TTL vs Retry | TTL: the item is stale and no longer valid. | Retry / Backoff: the item is not dead; only the time of the next attempt changes. | TTL asks "is it still alive?" Retry asks "when do we run again?" Do not mistake TaskQueue L3 for TTL. |
| What happens after expiry | Delete / Purge: FS / Session / PubSub | Flip status: PkgMgr DEPRECATED / OrderBook EXPIRED | Some L3 expiries delete the item; others keep it and only change the status. When you see DEPRECATED / EXPIRED, think status flip. |
| Two kinds of backup | Plain deepcopy: Bank / Hotel / LogAgg | Remaining-TTL restore: InMemDB / DNS | A backup without TTL is a straight copy. With TTL, restore must not reuse the old expiry; it recomputes expiry from the remaining TTL. |
| One-shot vs Recurring | One-shot: stops after it runs. | Recurring: schedules the next run after it runs. | The key point of Scheduler L4: execute_at += interval, and status is reset to SCHEDULED. |
| Retry vs Recurring | Retry / Backoff: TaskQueue L3 | Recurring schedule: Scheduler L4 | Retry changes next_due because the last attempt failed. Recurring runs again every interval by design, not because of a failure. |
| Hashring vs ChatRoute L2 | Count load: counts how many keys / requests. | Size load: sums size_mb. | Hashring = head count; ChatRoute = total MB. Same skeleton, different metric. |
| Top N vs Range Filter | Top N: Bank / Leaderboard / Hashring | Range / Threshold: TaskQueue / Hashring variants | Top N sorts and takes the first few. Range filters by a condition first and returns every match, not necessarily formatted as name(value). |
| Health timeout vs TTL boundary | Health: (ts - last_heartbeat) <= timeout | TTL: ts < expiry | Both are time comparisons, but the boundaries differ. Health usually appears as <= timeout; TTL is usually strictly less than. |
| History Query vs Restore | History query: Bank / Workflow | Restore state: InMemDB / DNS | History only looks at an old picture; the current state is unchanged afterwards. Restore really rolls the system back to a snapshot. |
| Latest <= ts vs Exact Snapshot | Latest at or before: InMemDB / DNS main | Exact match only: restore_strict variant | "most recent backup at or before" means find the nearest one. "exact snapshot timestamp" means you cannot sneak in latest-before logic. |
| Single-key lock vs Pair-lock | Single-key: InMemDB / DNS / PubSub | Pair-lock: Bank transfer / FS copy / Hotel upgrade | An op that touches one key takes one lock. An op that changes two sides takes two locks, ordered with sorted() to prevent deadlock. |
| Lock vs Semaphore | Lock: protects shared state. | Semaphore: caps the number of external calls in flight. | A lock means "do not let two callers corrupt the data at once". A semaphore means "only N requests go out at a time". L6 often has both. |
| Gather vs Worker Pool | Gather shell: one coroutine per op. | Worker pool: a fixed N workers pull from a queue. | TaskQueue L5 is not a plain gather. When you see num_workers / queue, switch to worker-pool thinking. |
| Input Order vs Finish Order | Return follows input order: gather list / results[i] | Actual completion order can vary: worker pool / concurrent ops | Under concurrency the finish order can differ, but the spec usually still wants answers in input position. TaskQueue L6 uses results[index] to pin the order. |
| Fail-Fast vs All-Sleep | Fail-Fast: an invalid item does not enter the sem and does not sleep. | All-Sleep: an invalid item still enters the sem and still sleeps. | Fail-fast = a failure does not sleep; all-sleep = a failure still sleeps. This is the most valuable L6 distinction. |
| Fail-Fast vs Lifecycle Dispatch | Simple fail-fast: Bank / DNS / Hashring | Lifecycle dispatch: TaskQueue / Workflow | Some L6s are just check + sleep + return. Others also change status: started -> dispatched / failed. |
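A minimal sketch of the two boundaries in the "Health timeout vs TTL boundary" row, assuming integer timestamps. The names `is_alive` and `is_healthy` are illustrative and not taken from any mock; a real spec may word its boundary differently, so always check the exact phrase.

```python
def is_alive(ts: int, expiry: int) -> bool:
    """TTL boundary: alive only while ts is strictly less than expiry."""
    return ts < expiry


def is_healthy(ts: int, last_heartbeat: int, timeout: int) -> bool:
    """Health boundary: usable while the silence has lasted at most `timeout`."""
    return (ts - last_heartbeat) <= timeout
```

At `ts == expiry` the TTL item is already dead, while a server whose silence equals `timeout` exactly still counts as healthy under this assumption.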
| Level | Most common concepts across all mocks | Items you have already studied and should remember | Representative mock / level |
|---|---|---|---|
| L1 | CRUD contract, data structure, duplicate policy (reject / overwrite), return type, order counter / id counter, list/set collection. | Hashring / ChatRoute: ring + clockwise route. InMemDB / DNS: nested dict skeleton. TaskQueue: basic shape of the task board. | Hashring L1, ChatRoute L1, InMemDB L1, DNS L1, TaskQueue L1, Bank L1, FS L1 |
| L2 | sort / filter / search / prefix scan, tie-break, formatted output, top-N vs range filter, count / ranking / aggregate, computed metric, route + stats / load, state transition. | ChatRoute: load = sum of size_mb, not a count. InMemDB / DNS: scan / scan_by_prefix. TaskQueue: the state machine starts here. | ChatRoute L2, InMemDB L2, DNS L2, TaskQueue L2, Bank L2, Workflow L2, OrderBook L2 |
| L3 | TTL boundary, lazy purge vs inline check, scheduled / auto-trigger, retry + backoff, virtual nodes / replicas, inactive timeout, health / status auto-change, expiry flips status. | Hashring / ChatRoute: virtual nodes = the same shop opening several branches. InMemDB / DNS: inline TTL. TaskQueue: retry with exponential backoff. Bank: lazy helper is the most typical base. | Hashring L3, ChatRoute L3, InMemDB L3, DNS L3, TaskQueue L3, Bank L3, PubSub L3, Chat L3 |
| L4 | backup / snapshot / restore, remaining TTL, latest-at-or-before vs exact snapshot lookup, history / historical query vs restore, merge / upgrade / move, dependency / DAG / circular check, rollback, capacity / eviction, offset / cursor / stateful read, domain-specific delta (partial fill, sticky session). | Hashring / ChatRoute: memory eviction / LRU. InMemDB / DNS: restore recomputes remaining TTL. TaskQueue: DAG deps. Workflow: rollback. Bank / Hotel: merge vs upgrade. | Hashring L4, ChatRoute L4, InMemDB L4, DNS L4, TaskQueue L4, Workflow L4, Bank L4, Hotel L4 |
| L5 | async batch shell, preserve input order, single-key lock, pair-lock with sorted order, worker pool, lock scope (per-key / per-domain / per-topic / per-channel / per-workflow), gather result shape. | Hashring / ChatRoute: per-request lock. InMemDB / DNS: per-key / per-domain lock. TaskQueue: worker pool, not the gather pattern. Bank / FS: pair-lock. | Hashring L5, ChatRoute L5, InMemDB L5, DNS L5, TaskQueue L5, Bank L5, FS L5, Workflow L5 |
| L6 | semaphore, check before sem vs inside sem, fail-fast vs all-sleep, external call simulation, lock + sem layering, lifecycle dispatch, extra prechecks, snapshot before await, result order / partial failure handling, domain-specific invalid rule. | ChatRoute: one extra bandwidth precheck. DNS: fail-fast + semaphore. TaskQueue / Workflow: lifecycle dispatch. InMemDB / Notification / PubSub / LogAgg: keep the all-sleep flavour distinct. | ChatRoute L6, DNS L6, TaskQueue L6, Workflow L6, InMemDB L6, Notification L6, PubSub L6, LogAgg L6 |
| What you will run into | How to recognise it | The picture to hold in your head (from the mocks) | Which mock / level to revisit |
|---|---|---|---|
| __init__ / base state | Start by looking at what state the system holds at the beginning. Most valuable: what the main collection is called, whether there is an id counter, and which fields are reserved for later levels. | Remember: this line is not business logic; it fixes the system's skeleton. Look at the main dict / list / ring first, then the counter / helper state. | Bank Init, Hotel Init, FS Init, Leaderboard Init, InMemDB Init, DNS Init, Hashring Init, ChatRoute Init, TaskQueue Init, Workflow Init, PkgMgr Init, OrderBook Init, PubSub Init, Chat Init, LogAgg Init, Gym Init |
| future hooks | Some __init__ methods already plant hooks that later levels will use, such as payment_counter, locks, history, pending queue. | | Bank Init, Hotel Init, InMemDB Init, TaskQueue Init, PubSub Init, Chat Init, LogAgg Init |
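| What you will run into | How to recognise it | The picture to hold in your head (from the mocks) | Which mock / level to revisit |
|---|---|---|---|
| flat dict | A one-level map. Take an id and you reach the whole record directly. | Bank L1 / Hotel L1 / FS L1 | Bank L1, Hotel L1, FS L1, Leaderboard L1, Notification L1, Session L1, Scheduler L1, Workflow L1, PkgMgr L1, OrderBook L1, Gym L1 |
| nested dict | A key on top, with fields / sub-keys inside. Think "a big drawer with small drawers inside". | InMemDB L1 / DNS L1 | InMemDB L1, DNS L1, Permission L1 |
| hash ring | Picture a street with shops and people on it. A person arrives at some point and walks forward along the street to the first shop. | Hashring L1 / ChatRoute L1: each shop on the street has its own door-number position. | Hashring L1, Hashring L2, ChatRoute L1, ChatRoute L2 |
| duplicate / overwrite policy | L1 often tests this in passing: is a repeated add rejected or overwritten? It differs by domain. | Bank L1: registering the same account again is usually rejected. DNS L1: the overwrite variant replaces the old record directly. Remember: it is all CRUD, but the duplicate policy can be exactly opposite. | Bank L1, DNS L1, LoadBal L1 |
| task board | Picture a task board where the key is task_id and the value is the ticket's data. | TaskQueue L1: there is no status at first; it only records queueing information. | TaskQueue L1 |
| list / set inside dict | Some L1s hold more than a plain value; the value already contains a list / set. Watch for words like topic, subscriber, messages. | PubSub L1: one topic holds both subscribers and messages. Chat L1: a channel likewise holds a user list + message list. | PubSub L1, Chat L1 |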
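The street picture for the hash ring can be sketched as below. This assumes the ring is kept as a sorted list of `(position, node_id)` pairs and that a key landing exactly on a node's position belongs to that node. Both are assumptions, and `route` is a hypothetical helper rather than a method from any mock. If a spec says the owner must be strictly clockwise of the key, `bisect.bisect_right` would be the swap.

```python
import bisect
from typing import List, Optional, Tuple


def route(ring: List[Tuple[int, str]], point: int) -> Optional[str]:
    """Return the first node at or after `point`, wrapping past the end."""
    if not ring:
        return None
    positions = [pos for pos, _ in ring]
    idx = bisect.bisect_left(positions, point)
    if idx == len(ring):
        idx = 0  # walked past the last shop: wrap to the start of the street
    return ring[idx][1]


# Hypothetical positions; a real mock derives them from a hash function.
ring = sorted([(10, "node_a"), (50, "node_b"), (90, "node_c")])
```

With this ring, a key at position 11 walks forward to `node_b`, and a key at 95 wraps around to `node_a`.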
| What you will run into | How to recognise it | The picture to hold in your head (from the mocks) | Which mock / level to revisit |
|---|---|---|---|
| sort / format | Every L2 sort function follows this flow. Step 1: call the lazy helper (if there is one). Step 2: start an empty list and loop to build one tuple per item. Step 3: sort, using the tuple as the sort key to control the order. Step 4: build the format string. The flow is the same across domains. | Bank L2 / Hotel L2 / FS L2. Remember: all three come from the same flow. | Bank L2, Hotel L2, FS L2, Leaderboard L2, Notification L2, Session L2, Scheduler L2, Workflow L2, PkgMgr L2, OrderBook L2, PubSub L2, Chat L2, LogAgg L2, Gym L2 |
| top-N vs range filter | top / highest / ranked usually means sort and take the first few. between / above / threshold usually means a range filter. | Bank L2 / Leaderboard L2 / Hashring L2 / TaskQueue L2 | Bank L2, Leaderboard L2, Hashring L2, TaskQueue L2 |
| route + stats / load metric | This is the ring family's L2. It no longer only routes; it starts recording which item was sent to which node, then computes load / top-N stats. | Hashring L2. ChatRoute L2: same skeleton, but load is not a count; it is the sum of size_mb. | Hashring L2, ChatRoute L2 |
| search / filter keyword | Words like search, contains, matching, top usually mean a text filter or count filter over the L1 data. | Chat L2. FS / Notification style: filter first, then sort, then format. | Chat L2, FS L2, Notification L2 |
| prefix scan | Find only the items that start with certain characters. Watch for prefix, scan, list by prefix. | InMemDB L2 / DNS L2 | InMemDB L2, FS L2, DNS L2, PkgMgr L2, Chat L2 |
| state machine | The same item moves between several statuses. Words like QUEUED / PROCESSING / COMPLETED mean you switch to this mode. | TaskQueue L2. Workflow / OrderBook variants: the same core, with extra transition rules, dependency, rollback, or fill/cancel rules. | TaskQueue L2, TaskQueue L3, TaskQueue L4, Workflow L3, Workflow L4, OrderBook L3 |
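The build-tuples, sort, format flow from the sort / format row could look like this. It is a sketch under stated assumptions: higher value first, ties broken by name ascending, and the `name(value)` output format. `top_n` is a hypothetical name, and the lazy-helper call of step 1 is omitted because this toy has no timestamps.

```python
from typing import Dict, List


def top_n(balances: Dict[str, int], n: int) -> List[str]:
    # Step 2: build one tuple per item; negate the value so that a plain
    # ascending sort puts the largest value first.
    rows = []
    for name, value in balances.items():
        rows.append((-value, name))
    # Step 3: tuple comparison gives "value desc, then name asc" for free.
    rows.sort()
    # Step 4: format the first n rows.
    return [f"{name}({-neg_value})" for neg_value, name in rows[:n]]
```

The negated-value trick keeps the tie-break readable; `sorted(..., key=lambda r: (-r[1], r[0]))` over `(name, value)` pairs would be an equivalent choice.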
| What you will run into | How to recognise it | The picture to hold in your head (from the mocks) | Which mock / level to revisit |
|---|---|---|---|
| TTL + inline check | As soon as you see ttl, expires_at, strictly less than, you know this level is about time. The key is to decide at once: `purge first`, or `inline check on read`. | Inline check, full steps (from InMemDB / DNS). In one sentence: expired items are not deleted; they stay in the dict, but every access asks "are you still alive?" and treats dead ones as invisible. Picture: the rubbish is still on the floor, but everyone steps over it; there is no cleaner. Step 1: define the "alive" test; it can be a helper or an inline compare, and it only answers alive or not without changing state. Step 2: add the check to every method that reads a single item directly (get / resolve / lookup). Step 3: add the liveness filter to every scan / list / filter / query method. Step 4: any op that depends on "does this item still count as existing" must check first (delete / redeem / consume all count). Key point: no helper is called at the top of the method. This is not the lazy pattern; each operation checks for itself. Boundary: alive uses <, not <=. Lazy purge, full steps (from FS / Session / Permission / Gym). In one sentence: before work starts a cleaner sweeps once, expired items are really del'd, and the system is clean afterwards. Picture: every time a customer comes through the door, the cleaner sweeps away all rubbish before the customer goes in. Step 1: write the cleanup logic; it can be a shared helper or the code behind a cleanup API named by the spec; the payload can be del item / remove child / auto check-out. Step 2: every method that needs the latest active set refreshes first; most commonly at the top of public methods, but if the spec defines an explicit cleanup API, that API triggers it. Step 3: collect first, then remove; what is removed can be an item, a nested child, or a session token, not only a path. Boundary: purge uses >=, not >. | InMemDB L3, DNS L3, FS L3, Session L3, Permission L3, LRU L3, PkgMgr L3, OrderBook L3, PubSub L3, Chat L3, LogAgg L3, Gym L3 |
| lazy helper | Some things should "happen automatically after a while", but the implementation does not start a timer; it catches up the next time someone touches the system. | Lazy helper, full steps (from Bank / FS / Notification). Step 1: write a helper / refresh path that catches up on whatever is due; first decide whether the payload is delete, flip, add money, deduct points, or a recurring effect. Step 2: by default put it at the top of the method; more precisely, every entry point where the spec expects the latest lazy effect must refresh. Step 3: it goes at the top rather than the end. Seen across domains in Bank / Hotel / FS / Session / PkgMgr / Notification / Scheduler / Leaderboard. | Bank L3, Hotel L3, FS L3, Leaderboard L3, Notification L3, Session L3, Scheduler L3, Permission L3, PkgMgr L3, OrderBook L3, PubSub L3, Chat L3, LogAgg L3, Gym L3 |
| retry / backoff / next_due | Words like retry, attempt, backoff, next run, retry scheduled mean this is not ordinary TTL; it is "when to try again after a failure". | Retry / backoff, full steps (from TaskQueue). In one sentence: a failure is not immediate death; it means "wait a while and try again", and the wait doubles each time (exponential backoff). Step 1: add the retry settings in init and the new fields in the task dict. Step 2: write the configure_retry method (sets the retry rules). Step 3: change the fail path: a failure does not kill the task; schedule when to try again; whether attempt is incremented before or after follows the spec's formula. Step 4: change the wake-up path: once due, the item must become selectable again; some specs have get_next_task skip / include it, others add a separate process_retries. | TaskQueue L3 |
| expiry flips status | Status words like DEPRECATED, EXPIRED, CLOSED are the cue: after expiry the item does not disappear; it stays in the store with a changed status, and some operations are then rejected because of that status. | Expiry flips status, full steps (from PkgMgr / OrderBook / Auction). In one sentence: the same approach as lazy TTL, but expiry changes the item's status instead of deleting it. The item is still there with a different identity. Step 1: add an availability marker + expiry source; the marker can live in the item dict or in a side map. Step 2: write the refresh path that updates availability; most commonly a lazy helper, or an inline update of the side marker. Step 3: every entry point that needs the latest availability updates first; most commonly the top of public methods. Step 4: inside the method, gate on the new status. PkgMgr L3: after expiry the package is not deleted, only marked DEPRECATED. OrderBook L3: an expired order becomes EXPIRED. Auction L3: after the deadline the auction changes status rather than being deleted. | PkgMgr L3, OrderBook L3, Auction L3 |
| auto-trigger / dual timer | Words like trigger, ready, deadline, auto-escalate, inactive_ttl mean this is not ordinary TTL. First decide: is a helper pushing the next step, or does one system have two timelines? | Auto-trigger, full steps (from Workflow). Step 1: write the helper. Step 2: where does it go? The safest place is the end of every state-changing method; queries refresh only when the spec says they need the latest state. Why the end? Because you change the status first and then check whether anything should be pushed. By default every entry point that changes state and may affect the next step calls _process_triggers. Step 3: the difference from lazy purge is placement. Purge goes at the top, because you clear the rubbish before doing real work (do not operate on expired items). Trigger goes at the end, because you finish changing status before checking what to push next (only then do you know which step is COMPLETED). Dual timer, full steps (from Chat / Notification / Scheduler). Step 1: work out what each timer tracks, then add the matching fields; there need not be exactly two init parameters, but there are always two sets of time metadata. Step 2: write two independent pieces of refresh logic, each minding its own timer; two helpers, or two sections in one helper. Step 3: every method that reads / changes an affected collection refreshes the relevant timer; most commonly both are called, but do not memorise "every API always calls both". Step 4: keep the two timelines independent. | Workflow L3, Scheduler L3, Chat L3, Notification L3 |
| health / availability flip | Words like heartbeat, timeout, healthy, unhealthy mean this L3 is not TTL; it is "can this still be used". Availability is computed on the spot; nothing is necessarily deleted or status-changed. | Health / availability flip, full steps (from LoadBalancer / Chat / Notification). In one sentence: not TTL expiry, but "if you have sent no heartbeat for long enough, I treat you as dead". Step 1: record the liveness source + timeout; the source can be last_heartbeat, last_active, or another activity timestamp. Step 2: write the _is_healthy test; a helper or an inline compare; it only answers usable or not. Step 3: every method that decides based on availability picks only healthy ones (route / pick / assign / list candidates). Step 4: write the entry point that refreshes liveness; typically a heartbeat method, though some domains update it on normal activity. Health is an inline check, not a purge: the server is not deleted. | LoadBal L3 |
| virtual nodes + replicas | Your usual picture is right: the same shop has more than one spot on the street; it opens several branches to take customers. | Virtual nodes / replicas, full steps (from Hashring / ChatRoute). Step 1: check whether the L1 base shape already uses a positions list; a replica just adds more positions under the same node. Step 2: write add_node_with_replicas. Step 3: the rest of the code stays almost unchanged only if L1 already uses a positions list; if L1 holds a single position, _ring / _route / _reassign must change too. Step 4: get_replica_count is simple. Hashring L3 / ChatRoute L3: the same node/server has several positions. Why hash f"{node_id}_{i}" and not f"{node_id}"? The _i suffix is given by the spec; you do not design it yourself. Remember: the core is not "adding one more method" but one node owning several positions. | Hashring L3, ChatRoute L3 |
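To make the inline-check versus lazy-purge split concrete, here is a side-by-side sketch. The class names `InlineStore` and `PurgeStore`, the `(value, expiry)` tuple layout, and the method signatures are all assumptions for illustration; the mocks may store TTL differently.

```python
class InlineStore:
    """Inline check: expired records stay in the dict but reads ignore them."""

    def __init__(self):
        self.data = {}  # key -> (value, expiry or None)

    def set(self, ts, key, value, ttl=None):
        expiry = ts + ttl if ttl is not None else None
        self.data[key] = (value, expiry)

    def _alive(self, ts, expiry):
        # Strict boundary: the record is dead at the exact expiry second.
        return expiry is None or ts < expiry

    def get(self, ts, key):
        if key not in self.data:
            return None
        value, expiry = self.data[key]
        return value if self._alive(ts, expiry) else None


class PurgeStore:
    """Lazy purge: every public method first deletes whatever has expired."""

    def __init__(self):
        self.data = {}  # key -> (value, expiry or None)

    def _purge(self, ts):
        # Collect first, then remove, so the dict is not mutated mid-iteration.
        expired = [k for k, (_, exp) in self.data.items()
                   if exp is not None and ts >= exp]
        for k in expired:
            del self.data[k]

    def set(self, ts, key, value, ttl=None):
        self._purge(ts)
        expiry = ts + ttl if ttl is not None else None
        self.data[key] = (value, expiry)

    def get(self, ts, key):
        self._purge(ts)
        entry = self.data.get(key)
        return entry[0] if entry is not None else None
```

Both stores answer reads identically; the observable difference is whether the dead key is still physically present, which matters for backup, count, and scan methods.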
| What you will run into | How to recognise it | The picture to hold in your head (from the mocks) | Which mock / level to revisit |
|---|---|---|---|
| count-based eviction | Phrases like capacity = at most N items, max keys, evict LRU mark the most common kind: before the new item comes in, evict one oldest / least-used item. Count vs size: count-based (Hashring) caps the number of items, used is a head count, a full store usually evicts 1, and it clears keys + key_access. Size-based (ChatRoute) caps MB, used is the sum of size_mb, a full store evicts in a while loop until there is room, and it clears requests + request_sizes + request_access. How to tell: "maximum number of keys/items" → count-based (Hashring style); "memory limit in MB/bytes" → size-based (ChatRoute style, Concept 2); "max sessions per user" → per-user (Session style, Concept 8). Keywords: capacity / maximum / limit / full / evict / LRU / least recently used / make room. | Count-based eviction, full steps (from Hashring / LRU). In one sentence: a shop holds at most N customers; when full, evict the oldest one (LRU). Hashring L4 example. Step 1: add a capacities dict in init. Step 2: write the set_capacity / get_capacity methods. Step 3: write evict_lru, which finds the oldest item and evicts it. Step 4: change store_key so that a full store evicts one. Key point: evicting one is enough, because count-based admits one and evicts one. Step 5: change remove_node to clear the capacity entry as well. | Hashring L4, LRU L3/L4 |
| size-based while-loop eviction | Phrases like size_mb, memory limit, while total > cap mean it is not a single eviction; usually a while loop evicts several. | L4 Concept 2: size-based while-loop eviction (MB capacity). In one sentence: the shop has an MB cap; when new stock would exceed it, a while loop keeps evicting the oldest items until enough room is free. Biggest difference from count-based: one large item may push out several small ones, so it is while, not if. Step 1: additions to __init__. Step 2: set_memory_limit sets the MB cap. Picture: the boss tells the warehouse keeper "this warehouse holds at most 50MB". Step 3: the _total_size helper computes how many MB one shop uses. Picture: walk into the warehouse, weigh each item, and add them up. Step 4: the _evict_lru_from helper evicts the oldest request in one shop. Step 5: assign_request is the core: while-loop eviction. Step 6: remove_server also clears memory_limits when a shop closes. | ChatRoute L4 |
| backup / restore + remaining TTL | These two usually come together: take a photo, then go back to a photo. If the problem has TTL, restore usually does not reuse the old expiry but recomputes the remaining lifetime. Also check whether the spec says latest at or before, or requires an exact snapshot. | L4 Concept 3: backup / restore + remaining TTL (photo + restore). In one sentence: take a photo (deepcopy) and later go back to it. For items with TTL, backup stores "how long is left" and restore recomputes expiry as "now + how long is left". Picture: the photo is your room as it looks now; restore puts the room back to how the photo looks; TTL is a glass of milk with 3 hours of life left when the photo is taken. Restore does not mean "the death time goes back to that clock time"; it means "3 hours left, counted from now". Variant 1, no TTL (Bank / Leaderboard). Step 1: __init__ creates the backups list. Step 2: backup deepcopies the whole main dict. Step 3: restore finds the photo and overwrites. Done in three steps. Variant 2, with TTL (InMemDB / DNS / Permission / FS). Backup cannot store expiry directly (the time at restore is different); it stores remaining_ttl, and restore uses now + remaining. Step 1: __init__ is the same as variant 1. Step 2: backup walks each item and computes remaining_ttl. Picture: in the photo you do not write "milk dies at 8 o'clock", you write "milk has 3 hours left", because you may restore at 2 pm when "8 o'clock" has already passed but "3 hours" still works. Step 3: restore finds the photo, recomputes expiry, and overwrites. Picture: you restore at 3 o'clock; the photo says "milk has 3 hours left"; 3 + 3 = 6, so you write "expiry = 6 o'clock" on the new milk. | Bank L4, InMemDB L4, FS L4, Leaderboard L4, DNS L4, Permission L4, LogAgg L4 |
| history tracking | Phrases like value at time X, changelog, event list mean you record changes and query them later. Key: the query does not change current state. | Variant 1: value at a time (Bank style). In one sentence: every time the value changes, append (timestamp, new_value) to a list; later you can ask "what was the value at time X". Use a reversed loop to find the first ts <= target. Step 1: __init__ gives each item a history list. Picture: opening a bank account also opens a passbook whose first line reads "opening day, $0". Step 2: append on every state change. deposit (one entity changes): the passbook gains a line "timestamp=3000, $500". transfer (two entities change, record both sides): Alice sends $200 to Bob, so both passbooks gain a line. A cashback credit inside the L3 helper must append too. Key rule: every place that changes balance must append; miss one and queries have a gap. Step 3: query with a reversed loop for ts <= target. Variant 2: event log (Workflow style). In one sentence: every status change appends a string to history[item_id], and get_history returns the whole list. Append only, never delete; return a copy, not the original. Step 1: __init__ creates a defaultdict(list). Step 2: the _record helper builds the string and appends it. The format is fixed: "{step_id}: {old}->{new}", with no timestamp (the order itself is the timeline). Step 3: _set_status is the single entry point that guarantees every status change is recorded. Step 4: get_history returns a copy of the list, like photocopying the logbook and keeping the original. | Bank L4, Workflow L4, PkgMgr L4, Notification L4 |
| status rollback | Phrases like undo, rollback on failure, revert completed steps mean that on failure you undo what was already done. This really changes current state, but it is not a snapshot restore. | L4 Concept 5: status rollback (one failure rolls everything back). In one sentence: when the pipeline fails midway, mark the failing step FAILED, then scan the whole line and roll every COMPLETED step back to PENDING. No deepcopy, no snapshot, no backup; it only changes status strings. Step 1: __init__ data structure. Picture: a factory line with 4 stations; the first two workers have signed off "COMPLETED", and the third breaks the item midway. Step 2: the _set_status helper is the single entry point for status changes. Step 3: fail_step is the core method and does two things: flag the failing step, then walk the whole line and roll back every step signed COMPLETED. Validation: only PROCESSING can fail. Rollback scans only COMPLETED and leaves other statuses alone. | Workflow L4 |
| merge / move / copy | In the exam, merge / combine → variant 1; move / upgrade → variant 2; copy / clone → variant 3. | Variant 1: merge (Bank). acc_1 absorbs acc_2's data, then del acc_2; the source no longer exists after the merge. Step 1: additions to __init__. Step 2: write merge_accounts. Variant 2: move / upgrade (Hotel). Move the guest from the source room to the dest room; both rooms still exist, but the source is emptied. Step 1: nothing to add in __init__. Step 2: write upgrade_room. Variant 3: copy (FS). Copy the source file's content to dest; the source is untouched, and dest may be new or overwritten. Step 1: nothing to add in __init__. Step 2: write copy_file. | Bank L4, Hotel L4, Chat L4, FS L4 |
| dependency + readiness (DAG) | Phrases like dependencies, blocked until, all deps completed, requires, depends on mark this kind. The core is a helper that checks "are all prerequisites COMPLETED", with that gate added wherever the next task is picked. | L4 Concept 7: dependency + readiness (DAG). In one sentence: Task A must wait for Task B. You do not dispatch at the start; before each pick you ask "is every item on the prerequisite list ticked?" Step 1: __init__ adds a dependencies field to the task dict. Picture: every work ticket carries a prerequisite list; all items must be ticked before the ticket can start. Step 2: write the _deps_met helper. Step 3: write add_task_with_deps. Note: list(dependencies) makes a copy; do not borrow the caller's list. Step 4: write get_ready_tasks (ready = QUEUED + deps met). Step 5: write get_blocked_tasks (blocked = QUEUED + deps NOT met); the only difference from get_ready_tasks is flipping the _deps_met result. Step 6: change _get_next_ready_task_id; it is much like get_ready_tasks but returns only the first ticket with the highest priority, FIFO within the same priority. Advanced: circular dependency (PkgMgr flavour). Add this layer only if the spec mentions circular / cycle / conflict; otherwise do not add it. | TaskQueue L4, PkgMgr L4, Workflow L4 |
| recurring / next execution | Words like interval, recurring, next execution mean the same item automatically pushes its future time forward again. | Variant 1: recurring schedule (reschedules itself after running). In one sentence: the item does not end after it runs; execute_at gains one interval and the item stays in the system for the next run. Step 1: the event itself carries interval. Step 2: after this trigger, do not delete; push the next time forward. Variant 2, easily confused: recurring vs retry vs one-shot. Recurring repeats by design. Retry reschedules only because of a failure. One-shot ends after it runs and leaves no next time. Exam cue: interval or next execution means recurring; do not mix it with retry-scheduled. | Scheduler L4, Leaderboard L4 |
| per-user capacity | Phrases like max per user, concurrent sessions, active count mean this is not LRU eviction but a quota / gate: the user gets a new item only if there is room. | Variant 1: per-user quota gate (Session flavour). In one sentence: a full quota does not evict anyone; count whether the user has room, and if not, reject the new session. Step 1: purge expired sessions first so dead sessions are not counted. Step 2: count only this user's active sessions. Step 3: compare with max_per_user; return False if there is no room. Step 4: create the new session only if there is room. Variant 2, easily confused: quota gate vs eviction. A quota gate refuses when there is no room. Eviction actively removes old items to make room for the new one. Exam cue: max per user, concurrent sessions → gate; evict, least recently used → LRU. | Session L4 |
| partial fill (rare) | Words like remaining, partial, trade_qty mean an item can be partly done but not finished. | Variant 1: partial fill (OrderBook flavour). In one sentence: an order need not fill in one go; the trade quantity is the smaller side, so both sides usually update remaining afterwards. Step 1: take the best-matching buy and sell orders. Step 2: trade quantity = the smaller of the two remaining values. Step 3: deduct from both sides. Step 4: one side may become COMPLETED while the other is only PARTIAL. Variant 2, easily confused: partial fill vs all-or-nothing. Partial fill: a partly done result is valid and the record keeps the remainder. All-or-nothing: the whole thing succeeds, or it fails / is rejected. Exam cue: the word remaining means this is no longer a "done once, then delete" problem. | OrderBook L4 |
| offset / cursor (rare) | Reads do not always start from the beginning. Some L4s remember "where the last read stopped". Watch for offset, consume, next unread. | Variant 1: offset / cursor (PubSub flavour). In one sentence: each reader has its own bookmark; consume continues from where it last stopped. Step 1: key on (topic_id, user_id) and record the index to start from next time. Step 2: consume first fetches the current offset. Step 3: return the messages from that position on, then update the offset. Picture: every user has a bookmark and moves it to the latest page after reading. Variant 2, easily confused: cursor vs stateless read. Cursor / offset: calling the same API a second time may give a different result, because internal state has advanced. Stateless scan / history query: the same input should return the same output. Exam cue: next unread, last consumed, cursor mean the read progress is itself state. | PubSub L4 |
| sticky / affinity (rare) | Words like sticky, session, use previous server, failover mean you must remember "where this was sent last time". | Variant 1: sticky / affinity (LoadBal flavour). In one sentence: a returning user goes to the previous server first; as long as that server is healthy, there is no re-routing. Step 1: record where the user went last. Step 2: on a new request, check for a sticky mapping first. Step 3: if the old server is still healthy, reuse it. Step 4: otherwise pick a new server via round-robin / failover, then write the sticky mapping. Variant 2, easily confused: sticky vs stateless routing. Sticky: the routing result is remembered, so the same user can land on the same server twice in a row. Round-robin / hash ring: usually recomputed from the current rules each time, with no memory of the last target. Failover: if the old server is dead, clear / overwrite the old mapping and move to the new one. | LoadBal L4 |
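The milk example for backup / restore with remaining TTL can be sketched as follows. Everything here is illustrative: `TtlStore`, its record layout, and the "latest backup at or before" lookup are assumptions, and a spec that demands an exact snapshot timestamp would need a different lookup.

```python
import copy


class TtlStore:
    def __init__(self):
        self.data = {}     # key -> {"value": ..., "expiry": int or None}
        self.backups = []  # (backup_ts, snapshot), appended in time order

    def set(self, ts, key, value, ttl=None):
        expiry = None if ttl is None else ts + ttl
        self.data[key] = {"value": value, "expiry": expiry}

    def get(self, ts, key):
        rec = self.data.get(key)
        if rec is None:
            return None
        if rec["expiry"] is not None and ts >= rec["expiry"]:
            return None  # inline TTL check: dead at the expiry second
        return rec["value"]

    def backup(self, ts):
        snapshot = {}
        for key, rec in self.data.items():
            expiry = rec["expiry"]
            if expiry is not None and ts >= expiry:
                continue  # already dead: not part of the photo
            remaining = None if expiry is None else expiry - ts
            snapshot[key] = {"value": copy.deepcopy(rec["value"]),
                             "remaining_ttl": remaining}
        self.backups.append((ts, snapshot))
        return len(snapshot)

    def restore(self, ts, ts_to_restore):
        chosen = None
        for backup_ts, snapshot in self.backups:
            if backup_ts <= ts_to_restore:
                chosen = snapshot  # keep the latest one at or before
        if chosen is None:
            return False
        self.data = {}
        for key, rec in chosen.items():
            remaining = rec["remaining_ttl"]
            # Recompute expiry from "now", never reuse the old clock time.
            expiry = None if remaining is None else ts + remaining
            self.data[key] = {"value": copy.deepcopy(rec["value"]),
                              "expiry": expiry}
        return True
```

Storing `remaining_ttl` in the snapshot rather than the raw expiry is the whole point: the restore timestamp is unknown at backup time.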
| What you will run into | How to recognise it | The picture to hold in your head (from the mocks) | Which mock / level to revisit |
|---|---|---|---|
| single-key lock + pair-lock | When the exam shows batch_operations, process_batch, concurrent, async, think L5 first. L5 wraps the synchronous methods you already wrote in L1-L4 in a lock + gather layer so they can run concurrently. | Pattern A: single-key lock (most mocks). Cue: each op changes one item (set, get, delete, add, remove...). Step 1: __init__ adds a locks dict. defaultdict(asyncio.Lock): the first access to a key creates a new lock automatically, with no manual create. Step 2: write batch_operations. Points to remember: outer async def + inner async def execute_op, async with self.key_locks[key], if/elif branches that call the existing synchronous methods directly, tasks.append(...) → gather(*tasks) → list(results). InMemDB L5: key = op["key"], async with self.key_locks[key], call the old L1-L4 methods. DNS L5: key becomes domain, but the lock scope is still a single item. Permission L5: if the op stays around one user/resource, it is still the single-key family. Pattern B: pair-lock (Bank transfer / Hotel upgrade / FS copy). Cue: one op changes two items (transfer, upgrade, copy). Step 1: __init__ is the same as Pattern A, a per-item locks dict; pair-lock just takes two locks at once. Step 2: write process_batch. Points to remember: if the op touches 2 items → sorted([a, b]) → two nested async with; if it touches one side, fall back to the single-lock branch. Bank L5: the transfer branch sorts source_id / target_id first; deposit / pay still lock only one account_id. Hotel L5: upgrade changes from_room_id + to_room_id; book / checkout change one room, so they stay on the single-lock branch. FS L5: the copy branch locks both paths; add / delete lock one path each. This one has no gather, but the pair-lock core is identical. | InMemDB L5, DNS L5, Permission L5, Notification L5, Session L5, Scheduler L5, Workflow L5, PkgMgr L5, OrderBook L5, PubSub L5, Chat L5, LogAgg L5, Hashring L5, ChatRoute L5, Bank L5, Hotel L5, FS L5 |
| worker pool (TaskQueue only) | Phrases like num_workers, workers process tasks from queue, loop until no more mean this is completely different from A/B. It is not 1 op = 1 coroutine; N workers each run while True and grab tasks. | Pattern C: worker pool (TaskQueue only). Step 1: __init__ adds a global lock. Step 2: write run_workers. Key point: workers.append(worker()) only creates N coroutines; await asyncio.gather(*workers) is what actually starts them together. Quick exam check: the op changes 1 item → A; changes 2 items → B; the spec mentions num_workers / worker pool → C. | TaskQueue L5 |
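Pattern B can be sketched as below. `MiniBank`, its op dictionaries (`type`, `source`, `target`, `account`, `amount`), and the rule that a self-transfer returns False are assumptions made for this illustration; the real mocks define their own op shapes. The part that carries over is `sorted()` before taking two nested locks.

```python
import asyncio
from collections import defaultdict


class MiniBank:
    def __init__(self):
        self.balances = {}  # account_id -> int balance
        self.locks = defaultdict(asyncio.Lock)  # lock created on first access

    # Synchronous methods standing in for the earlier-level logic.
    def deposit(self, account_id, amount):
        if account_id not in self.balances:
            return False
        self.balances[account_id] += amount
        return True

    def transfer(self, source_id, target_id, amount):
        if source_id == target_id:
            return False
        if source_id not in self.balances or target_id not in self.balances:
            return False
        if self.balances[source_id] < amount:
            return False
        self.balances[source_id] -= amount
        self.balances[target_id] += amount
        return True

    async def process_batch(self, ops):
        async def execute_op(op):
            if op["type"] == "transfer":
                src, dst = op["source"], op["target"]
                if src == dst:
                    # asyncio.Lock is not reentrant: never take it twice.
                    return False
                # Same global order for every coroutine prevents deadlock.
                first, second = sorted([src, dst])
                async with self.locks[first]:
                    async with self.locks[second]:
                        return self.transfer(src, dst, op["amount"])
            async with self.locks[op["account"]]:
                return self.deposit(op["account"], op["amount"])

        tasks = [execute_op(op) for op in ops]
        # gather returns results in input order, whatever the finish order.
        return list(await asyncio.gather(*tasks))
```

Without the `sorted()` call, a transfer a→b and a transfer b→a could each hold one lock and wait forever for the other once the locked section contains an `await`.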
First decide whether it is an async batch; if so, split into `single key` or `pair key`. Only when you see num_workers / queue do you switch to worker pool. In essence it always wraps the old L1-L4 methods in an async shell; it does not rewrite business logic.
| What you will run into | How to recognise it | The picture to hold in your head (from the mocks) | Which mock / level to revisit |
|---|---|---|---|
| fail-fast pattern (most mocks) | L6 = L5 + semaphore + sleep. Words like max_concurrent, rate limit, external call, dispatch put you at this level. If the spec says "invalid items should NOT acquire the semaphore" or "return immediately if not valid", it is fail-fast. | L6: rate-limited external call, exam steps. Pattern A: fail-fast (most mocks). Step 1: nothing to add in __init__; the Semaphore is created locally inside the function, because max_concurrent may differ per call. Step 2: the skeleton. Bank L6: a batch of transfers goes to an overseas bank; check the balance locally first and leave at once if it is short; only then make the real call. Hashring L6: sync a customer's data to other branches; check that the customer and the target branch exist; only then send. Both are the fail-fast skeleton: local check first, enter the sem only after passing; Bank adds a lock layer because it also changes account state. Other fail-fast mocks. Hotel send_notifications: notify guests of the checkout time; check that the room is occupied; only then push. FS sync_files: sync files to external storage; check that the file exists; only then upload. Leaderboard sync_scores: sync player scores to an external leaderboard; check that the player exists; only then send. Session sync_sessions: sync session state to an external session store; check that the session exists and has not expired; otherwise leave at once. Scheduler dispatch_events: dispatch due events for external execution; check that the event exists and is due. DNS L6: send DNS records to other DNS servers; check that the domain and record exist. ChatRoute L6: move sessions from one server to another; check that the source server has enough bandwidth. PkgMgr L6: download a package from an external server; check that the package exists. Fail-fast has more than one look: beyond plain True/False there are extra local checks (DNS / ChatRoute) and a richer return shape (PkgMgr's skipped / downloaded / error). Permission sync_permissions: sync permissions to an external auth server; check that the user and resource exist. LoadBalancer health_check: call each server to see whether it is alive; check that the server is in your list first. Workflow execute_external: dispatch ready steps for external execution; check that the step is READY; after execution mark it COMPLETED or FAILED. OrderBook settle_trades: send filled trades to the exchange for settlement; check that the trade exists and is filled. Chat sync_messages: sync messages to an external server; check that the channel has messages. | Bank L6, Hotel L6, Hashring L6, ChatRoute L6, FS L6, Leaderboard L6, Session L6, Scheduler L6, DNS L6, Permission L6, LoadBal L6, Workflow L6, PkgMgr L6, Chat L6, Gym L6, Tetris L6, Library L6, Playlist L6, Auction L6, OrderBook L6 |
| all-sleep pattern (sleep first, learn of failure later) | The spec does not mention skipping invalid items, or states outright "simulate call for all" / "every item takes processing time". The single core difference: no check; go straight into sem + sleep. | Pattern B: all-sleep (InMemDB / Notification / PubSub / LogAgg). Step 1: the skeleton differs from fail-fast in one place. Why does a missing key still sleep? Because all-sleep simulates a real disk read: you walk to the drawer and pull it open before you learn it is empty; you cannot skip it from the doorway. InMemDB L6: open many drawers at once; each must really be opened (disk read) even if it is empty. Notification L6: push alerts to users; each alert really calls the push server once, and only afterwards do you know whether it succeeded; locally you cannot tell whether the device token is valid. This is also all-sleep and even clearer than batch_scan: it does not check whether the alert exists first; the send failure is known only after the sleep. PubSub L6: push a message to every subscriber; each push is a real call, and only afterwards do you know whether it was received; you do not know who is online. LogAgg L6: export logs to an external analytics platform; every log batch is really sent, and success or failure is known afterwards. Quick spec check: "without acquiring semaphore" / "return immediately if invalid" / "skip invalid items" → fail-fast; "simulate call for all" / "every item takes processing time" → all-sleep. All-sleep has at least 3 looks: return results after scan, check after sleep, attempt every item + try/except to collect results. | InMemDB L6, Notification L6, PubSub L6, LogAgg L6 |
| worker pool + lifecycle (TaskQueue only) | Phrases like dispatch completed tasks, mark as DISPATCHED, status transition during external call mark this kind. It is not a plain fail-fast gather; it uses a shared list such as results[index] to keep input order. | Pattern C: worker pool + lifecycle (TaskQueue only). What it does: send finished tasks to an external system. Check that the task exists and is COMPLETED; if not, leave at once; otherwise dispatch it and mark it DISPATCHED afterwards. | TaskQueue L6 |
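The one-place difference between fail-fast and all-sleep can be seen in this sketch. The function names, the `store` dict, and the `call_log` list used to count simulated external calls are assumptions added for illustration; the 0.01-second sleep merely stands in for whatever delay a spec prescribes.

```python
import asyncio


async def external_call(call_log, key):
    """Stand-in for the rate-limited external API (hypothetical)."""
    call_log.append(key)
    await asyncio.sleep(0.01)


async def sync_fail_fast(store, keys, max_concurrent, call_log):
    sem = asyncio.Semaphore(max_concurrent)  # local: the limit can differ per call

    async def one(key):
        if key not in store:  # local check BEFORE the semaphore
            return False      # invalid item: no slot taken, no sleep
        async with sem:
            await external_call(call_log, key)
        return True

    return list(await asyncio.gather(*(one(k) for k in keys)))


async def sync_all_sleep(store, keys, max_concurrent, call_log):
    sem = asyncio.Semaphore(max_concurrent)

    async def one(key):
        async with sem:  # every item takes a slot and pays for the call
            await external_call(call_log, key)
            return key in store  # validity is reported only after the call

    return list(await asyncio.gather(*(one(k) for k in keys)))
```

Both functions return the same booleans for the same input; what differs is how many external calls were made, which is exactly what a timing-based test in the exam would detect.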
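The most practical way to run through it: `L1` recognise the skeleton, `L2` recognise scan/sort, `L3` split the time-based variants, `L4` look at restore / deps, `L5` look at the lock shape, `L6` look at fail-fast / all-sleep. You can now scan this page straight from top to bottom.
How to use: read the left side as if it were a real spec, and read only three lines on the right: `Verdict` (先判), `Signal`, `Why` (點解). Practise until, within 10 seconds of reading the left side, you can state the right side yourself.
There are two tiers below: the short specs on top help you split domains quickly; the long specs underneath are deliberately worded more like CodeSignal progressive questions.
Note: these are synthetic mocks, not leaked originals. The tone mainly follows the public CodeSignal progressive docs and the publicly available summaries of Anthropic / cloud storage / in-memory DB questions on 1Point3Acres.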
| Mock Spec | Answer |
|---|---|
| | Verdict: Hashring / ChatRoute family. Why: once these few words appear, it is already ring routing. |
| | Verdict: InMemDB family. |
| | Verdict: TaskQueue / Workflow family. Why: the core of this problem is not CRUD but state transition. |
| | Verdict: FileSystem family; the base is close to Bank, with an Inventory flavour later. Why: it starts as a flat dict. |
| | Verdict: L6 all-sleep pattern. Why: the most valuable part is not the domain name but the last sentence. It states clearly that invalid items are still simulated, so it is all-sleep. |
| | Verdict: Workflow family, L6 lifecycle / dispatch flavour. Why: this is the life cycle of one step. It stresses "start → external call → complete / error", so it looks more like lifecycle / dispatch than a plain batch. |
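How to use: if you finish a round and still cannot tell them apart, read only the Signal line of each question and force yourself to explain why in one sentence.
This batch is deliberately longer, to simulate the exam, where you first see 1-2 paragraphs of background and then the Part / Level requirements.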
| Long Mock Spec | Answer |
|---|---|
| | Verdict: Hashring / ChatRoute family. Why: the first half is already consistent hashing. |
| | Verdict: InMemDB family. Why: this is almost exactly the tone of public InMemDB-style questions. `key + field` means nested dict; `remaining TTL` points straight at L4 restore with TTL recalc. |
| | Verdict: DNS family. Why: it is still the InMemDB skeleton with the business words swapped for DNS. `domain + record + resolve` is the first signal; `remaining lifetime` tells you the restore rule is the same kind. |
| | Verdict: FileSystem / Cloud Storage family. Why: this opening tone really appears in public cloud storage summaries. The skeleton is a flat dict. |
| | Verdict: TaskQueue family. Why: the key is not CRUD but state transition. The last sentence even spells out the worker pool shape, which separates it from a plain batch gather. |
| | Verdict: Workflow family. Why: this is closer to lifecycle orchestration than TaskQueue. The focus is one step's life cycle and dispatching ready steps. |
| | Verdict: Inventory / Multi-collection family. Why: as soon as two warehouses change together, switch from single-key to pair-lock thinking. This is not a single-record problem; it is multi-collection. |
| | Verdict: L6 all-sleep pattern. Why: the key is not the domain but the toggle in the last sentence. It says outright that invalid items cannot fail-fast and must go through the semaphore path. |
| | Verdict: Hotel family. Why: it opens with a flat dict + lazy helper flavour. |
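The most realistic way to practice: read only the first paragraph on the left of a long mock. Before reading Part 1/2/3, guess the family. Then read the later levels and guess the L3 / L5 / L6 variant first.
A banking system: create account → deposit → transfer → pay → rank → historical query → merge → concurrent processing.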
# Bank = 最基本嘅 dict-of-dicts 系統
# 核心 data structure:
self.accounts = {} # account_id → {balance, outgoing, history}
self.payment = {} # payment_id → {account_id, cashback_amount, ...}
# 每個 method 第一個 param 都係 timestamp
# L3 開始每個 method 開頭都 call self._process_cashbacks(timestamp)
# Level 進程:
# L1: CRUD(建/存/轉)
# L2: Sort/Filter/Format(排名)
# L3: Timestamp/TTL/Lazy(付款 + cashback)
# L4: Backup/Merge/History(歷史查詢 + 合併)
# L5: Concurrent Batch(async gather + lock)
# L6: Rate Limited External(+ semaphore)
Bank 同其他 system 嘅分別:
Bank 用 dict-of-dicts(accounts 入面又係 dict)
Hashring 用 dict + list(nodes 入面有 positions list)
Hotel 用 dict + set(rooms 用 set 記邊日 booked)
Bank 嘅特色:
1. 每個 method 都有 timestamp param
2. L3 開始有 lazy processing(_process_cashbacks)
3. transfer 涉及兩個 account → L5 要 lock 兩個
4. 所有數值操作都用 integer(唔用 float)
accounts 實際樣子:
{
"alice": {"balance": 500, "outgoing": 100, "history": [(1,0), (3,500)]},
"bob": {"balance": 200, "outgoing": 0, "history": [(2,0), (4,200)]}
}
每個 account 有三個 field:
balance = 而家有幾多錢(int)
outgoing = 總共轉/付咗幾多出去(int,排名用)
history = [(timestamp, balance), ...] 每次改完 balance 都 append
import asyncio  # L5: async
import copy  # L4: deepcopy
import bisect  # binary search, needed by some questions
from collections import defaultdict  # L5: auto-create locks
class BankSystem:
    def __init__(self):
        self.accounts = {}  # L1: account_id → {balance, outgoing, history}
        self.payment_counter = 0  # added at L3: auto-increment payment ID
        self.payment = {}  # added at L3: payment_id → {account_id, cashback_amount, ...}
        self.merged_accounts = {}  # added at L4: old_id → new_id (merge record)
        self.account_locks = defaultdict(asyncio.Lock)  # added at L5: one lock per account
self.accounts 實際樣子:
{
"alice": {
"balance": 500,
"outgoing": 100,
"history": [(1,0), (3,500)]
},
"bob": {
"balance": 200,
"outgoing": 0,
"history": [(2,0), (4,200)]
}
}
點攞 data:
acc = self.accounts["alice"]
print(acc)
# {"balance": 500, "outgoing": 100, "history": [(1,0),(3,500)]}
bal = acc["balance"]
print(bal)
# 500
acc["balance"] += 100
print(acc["balance"])
# 600
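# alice's balance really became 600 (acc refers to the same dict)
bal += 100
print(bal)
# 600
# this changed only the local copy: alice's balance stays at 600 and is not bumped to 700 (bal is a copy of the int)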
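The reference-versus-copy point above can be checked with a tiny standalone script. This is a minimal sketch; the sample `accounts` dict is made up for illustration.

```python
accounts = {"alice": {"balance": 500, "outgoing": 100}}

acc = accounts["alice"]   # acc is another name for the same inner dict
bal = acc["balance"]      # bal is an independent int (500)

acc["balance"] += 100     # mutates the shared dict: alice now has 600
bal += 1                  # rebinds the local name only: bal is 501

print(accounts["alice"]["balance"])  # 600, unaffected by the change to bal
print(bal)                           # 501
```

The rule of thumb: pulling out a dict or list gives you a handle on the original, while pulling out an int or str gives you a value you can change freely.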
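# Helper: _process_cashbacks, called at the start of every method (from L3 on)
# Lazy processing: nothing expires on its own; the check runs only when some method is called with a timestamp
def _process_cashbacks(self, timestamp):  # walk all payments and credit the ones that are due
    for pid, p in self.payment.items():  # look at each payment
        if not p["received"] and timestamp >= p["cashback_time"]:  # not yet credited + due
            acc = self.accounts[p["account_id"]]  # owner of this payment
            acc["balance"] += p["cashback_amount"]  # credit the cashback now that it is due
            acc["history"].append((p["cashback_time"], acc["balance"]))  # L4 addition: history tracks every balance change; recording it at the due time is an assumption
            p["received"] = True  # mark as credited so it is never paid twice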
_process_cashbacks 嘅職責:
每次有人 call 任何 method 帶 timestamp 入嚟
→ 行晒 self.payment dict
→ 搵未收過 (received=False) + 到期 (timestamp >= cashback_time) 嘅
→ 將 cashback_amount 加返入 account 嘅 balance
→ 標記 received = True
# 點解叫 "lazy"?
# 因為唔係到期就自動入賬
# 要等下一次有人 call method 先 check
# 例如 cashback_time = 100
# 但如果 timestamp=99 call create_account
# → check: 99 < 100 → 唔入賬
# timestamp=101 call deposit
# → check: 101 >= 100 → 入賬!
self.payment 實際樣子:
self.payment = {
"payment1": {
"account_id": "alice",
"cashback_amount": 10,
"cashback_time": 86401000,
"received": False
}
}
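To see the lazy behaviour end to end, here is a minimal standalone sketch using module-level dicts instead of the class. The function name `process_cashbacks` and the starting balance of 490 are assumptions for illustration only.

```python
accounts = {"alice": {"balance": 490}}
payments = {
    "payment1": {
        "account_id": "alice",
        "cashback_amount": 10,
        "cashback_time": 86401000,
        "received": False,
    }
}

def process_cashbacks(timestamp):
    """Credit every payment whose cashback is due and not yet received."""
    for p in payments.values():
        if not p["received"] and timestamp >= p["cashback_time"]:
            accounts[p["account_id"]]["balance"] += p["cashback_amount"]
            p["received"] = True

process_cashbacks(86400999)                 # one millisecond too early
early = accounts["alice"]["balance"]        # still 490

process_cashbacks(86401000)                 # exactly at the due time
on_time = accounts["alice"]["balance"]      # 500

process_cashbacks(86402000)                 # later calls must not pay twice
later = accounts["alice"]["balance"]        # still 500
```

The `received` flag is what makes repeated calls safe, which matters because every public method triggers the helper.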
Create an account, deposit, and transfer. Every method checks whether the account exists and returns the appropriate value.
def create_account(self, timestamp, account_id): # 開新戶口
self._process_cashbacks(timestamp) # L3 加:先處理到期嘅 cashback
if account_id in self.accounts: # 呢個戶口存唔存在?
return False # 已經有 → 拒絕重複建
self.accounts[account_id] = { # 開個新戶口,記低資料
"balance": 0, # L1
"outgoing": 0, # L2 加
"history": [(timestamp, 0)] # L4 加
}
return True # 開成功
def deposit(self, timestamp, account_id, amount): # 存錢入戶口
self._process_cashbacks(timestamp) # L3 加:先處理到期嘅 cashback
if account_id not in self.accounts: # 戶口唔存在?
return None # 搵唔到 → 冇嘢做
acc = self.accounts[account_id] # 攞出嚟(pointer,改佢即改原本)
acc["balance"] += amount # 加錢入去
acc["history"].append((timestamp, acc["balance"])) # L4 加
return acc["balance"] # 回傳新餘額
create_account(1000, "alice") 之後:
self.accounts = {
"alice": {
"balance": 0,
"outgoing": 0,
"history": [(1000, 0)]
}
}
deposit(3000, "alice", 500) 之後:
self.accounts = {
"alice": {
"balance": 500, # 0+500
"outgoing": 0,
"history": [(1000,0), (3000,500)] # 加咗
}
}
dict access 常見錯:
d.balance # 錯 — dot 係 class 用
d["balance"] # 啱 — dict 用 bracket
d["account_id"] # 錯 — 搵 literal string
d[account_id] # 啱 — 用 variable 嘅值
acc = d["alice"] # pointer,改 acc 改原本
bal = d["alice"]["balance"] # copy,改 bal 唔影響
def transfer(self, timestamp, source_id, target_id, amount): # 由一個戶口轉錢去另一個
self._process_cashbacks(timestamp) # L3 加:先處理到期嘅 cashback
if source_id not in self.accounts or target_id not in self.accounts: # 兩個戶口都存在?
return None # 有一個唔喺度 → 唔做
if source_id == target_id: # 自己轉畀自己?
return None # 唔准
s_acc = self.accounts[source_id] # 攞 source 戶口(pointer)
t_acc = self.accounts[target_id] # 攞 target 戶口(pointer)
if s_acc["balance"] < amount: # source 夠唔夠錢?
return None # 唔夠 → 唔做
s_acc["balance"] -= amount # source 扣錢
s_acc["outgoing"] += amount # L2 加
s_acc["history"].append((timestamp, s_acc["balance"])) # L4 加
t_acc["balance"] += amount # target 收錢
t_acc["history"].append((timestamp, t_acc["balance"])) # L4 加
return s_acc["balance"] # 回傳 source 嘅新餘額
transfer(5000, "alice", "bob", 200) 之前:
"alice": {"balance": 500, "outgoing": 0}
"bob": {"balance": 200, "outgoing": 0}
之後:
"alice": {"balance": 300, "outgoing": 200}
"bob": {"balance": 400, "outgoing": 0}
s_acc = self.accounts["alice"] # pointer
s_acc["balance"] -= 200 # alice 變 300
t_acc = self.accounts["bob"] # pointer
t_acc["balance"] += 200 # bob 變 400
Check 順序:
1. 兩個 account 都存在?
2. 唔係同一個?
3. source 夠錢?
→ 全部 pass 先做嘢
Ranking (top_spenders): sort by outgoing descending, ties broken by name ascending. The return format is "name(value)".
def top_spenders(self, timestamp, n): # 排頭 N 個花最多錢嘅戶口
self._process_cashbacks(timestamp) # 先處理到期 cashback
sorted_items = dict(sorted( # 排序:outgoing 大排先,同分按名 asc
self.accounts.items(), # 將全部戶口攤平畀 sorted 排榜;之後先可以揀出 top spenders
key=lambda x: (-x[1]["outgoing"], x[0]) # -值 = desc,名 = asc
))
result = [] # 準備裝 format 完嘅 string
for account_id, acc in sorted_items.items(): # 逐個戶口行
result.append(f"{account_id}({acc['outgoing']})") # 格式化成 "名(值)"
return result[:n] # 淨攞頭 N 個
sorted 點運作:
self.accounts.items() 出嚟嘅 x:
x = ("alice", {"balance":500, "outgoing":300})
x = ("bob", {"balance":200, "outgoing":100})
x = ("charlie",{"balance":800, "outgoing":300})
x[0] = "alice" # account name
x[1] = {"balance":500} # 成個 dict
x[1]["outgoing"] = 300 # 攞 outgoing 值
key=lambda x: (-x[1]["outgoing"], x[0])
# #
-300 排先(desc) 同分按名排(asc)
排完:
(-300, "alice") # alice 先(a < c)
(-300, "charlie") # charlie 後
(-100, "bob") # bob 最尾
result = ["alice(300)", "charlie(300)", "bob(100)"]
result[:2] = ["alice(300)", "charlie(300)"]
f-string 點用:
account_id = "alice"
acc["outgoing"] = 300
f"{account_id}({acc['outgoing']})"
# "alice(300)"
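The sort-then-format steps above can be run on their own. This is a minimal sketch with made-up sample data; it sorts the items directly, without converting back to a dict first.

```python
accounts = {
    "alice": {"balance": 500, "outgoing": 300},
    "bob": {"balance": 200, "outgoing": 100},
    "charlie": {"balance": 800, "outgoing": 300},
}

# Negating the number gives descending order; the name breaks ties ascending.
ranked = sorted(accounts.items(), key=lambda kv: (-kv[1]["outgoing"], kv[0]))

top2 = [f"{name}({acc['outgoing']})" for name, acc in ranked[:2]]
print(top2)  # ['alice(300)', 'charlie(300)']
```

Slicing `ranked[:2]` before formatting avoids building strings that are thrown away, though formatting first and slicing after gives the same result.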
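pay deducts the amount and schedules a cashback that becomes due 24h later; it is credited lazily, the next time any method runs. Every method calls _process_cashbacks first to settle whatever is due. get_payment_status returns "IN_PROGRESS" or "CASHBACK_RECEIVED".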
def pay(self, timestamp, account_id, amount): # 付款(觸發 scheduled cashback)
self._process_cashbacks(timestamp) # 先處理到期嘅 cashback
if account_id not in self.accounts: return None # 戶口唔存在 → 唔做
acc = self.accounts[account_id] # 攞戶口 pointer
if acc["balance"] < amount: return None # 唔夠錢 → 唔做
acc["balance"] -= amount # 扣錢
acc["outgoing"] += amount # 記去 outgoing(排名用)
acc["history"].append((timestamp, acc["balance"])) # 記歷史
cashback = amount * 2 // 100 # 計算 cashback amount(2%,整數除法)
cashback_time = timestamp + 86400000 # 24 小時後(毫秒)
self.payment_counter += 1 # auto-increment ID
pid = f"payment{self.payment_counter}" # "payment1", "payment2", ...
self.payment[pid] = { # 記低 payment 資料
"account_id": account_id, # 邊個 account 嘅
"cashback_amount": cashback, # 回贈幾多
"cashback_time": cashback_time, # 幾時到期可收
"received": False # 未收過
}
return pid # 回傳 payment ID
pay(1000, "alice", 500) 做咗咩:
1. _process_cashbacks(1000) — 先處理舊嘅到期 payment
2. check alice 存在 + 夠錢
3. 扣 500:alice balance 由 1000 → 500
4. outgoing += 500
5. history append (1000, 500)
6. 計 cashback = 500 * 2 // 100 = 10
7. cashback_time = 1000 + 86400000 = 86401000
8. payment_counter: 0 → 1
9. self.payment["payment1"] = {
"account_id": "alice",
"cashback_amount": 10,
"cashback_time": 86401000,
"received": False
}
10. return "payment1"
L3 嘅 __init__(加 payment 相關):
def __init__(self):
self.accounts = {}
self.payment_counter = 0 # L3 加
self.payment = {} # L3 加
self.merged_accounts = {} # L4 加
self.account_locks = defaultdict(asyncio.Lock) # L5 加
def get_payment_status(self, timestamp, account_id, payment_id): # 查付款狀態
self._process_cashbacks(timestamp) # 先處理到期嘅 cashback
if account_id not in self.accounts: return None # 戶口唔存在
if payment_id not in self.payment: return None # payment 唔存在
if self.payment[payment_id]["account_id"] != account_id: return None # 唔屬於呢個戶口
if self.payment[payment_id]["received"]: # cashback 收咗未?
return "CASHBACK_RECEIVED" # 已收
return "IN_PROGRESS" # 未到期,仲等緊
get_payment_status 嘅 check 順序:
1. account 存唔存在? → None
2. payment_id 存唔存在? → None
3. payment 屬唔屬於呢個 account? → None
4. received == True? → "CASHBACK_RECEIVED"
5. 都唔係? → "IN_PROGRESS"
例子:
get_payment_status(500, "alice", "payment1")
→ payment1 嘅 cashback_time = 86401000
→ 500 < 86401000 → 未到期 → received 仲係 False
→ return "IN_PROGRESS"
get_payment_status(86402000, "alice", "payment1")
→ _process_cashbacks(86402000) 會先入賬!
→ received 變 True
→ return "CASHBACK_RECEIVED"
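get_balance(time_at) looks up the balance at a past point in time (reversed loop over history). merge combines two accounts (balance and outgoing are summed, then the second account is deleted).
def get_balance(self, timestamp, account_id, time_at):  # balance as of a past point in time
    self._process_cashbacks(timestamp)  # settle due cashbacks first
    if account_id not in self.accounts:
        return None  # account does not exist
    acc = self.accounts[account_id]  # reference to the account dict
    for ts, val in reversed(acc["history"]):  # walk from the newest entry backwards
        if ts <= time_at:  # first entry at or before time_at
            return val  # the balance at that moment
    return None  # time_at is earlier than the first history entry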
get_balance 點運作:
history = [(1000,0), (3000,500), (5000,300)]
get_balance(10000, "alice", 4000) → 問:timestamp=4000 嗰陣幾多錢?
reversed 由尾行:
(5000,300) → 5000 > 4000 → skip
(3000,500) → 3000 <= 4000 → return 500!
點解用 reversed?
因為 history 係 append-only(timestamp 升序)
由尾搵 = 搵最後一個 <= time_at 嘅 entry
= 嗰個時間點嘅 balance
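Because history is sorted by timestamp, the same lookup can also be written with `bisect`. This is a sketch of an alternative, not the approach used in these notes; the helper name `balance_at` is hypothetical.

```python
import bisect

def balance_at(history, time_at):
    """Return the balance of the last entry with timestamp <= time_at, or None."""
    timestamps = [ts for ts, _ in history]           # history is ascending by timestamp
    idx = bisect.bisect_right(timestamps, time_at)   # number of entries with ts <= time_at
    if idx == 0:
        return None                                  # time_at is before the first entry
    return history[idx - 1][1]

history = [(1000, 0), (3000, 500), (5000, 300)]
print(balance_at(history, 4000))  # 500
print(balance_at(history, 5000))  # 300
print(balance_at(history, 999))   # None
```

Building the `timestamps` list is itself linear, so this only pays off if the timestamps are kept in a separate list as entries are appended. For interview-sized inputs the reversed loop is usually simpler and fast enough.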
L4 嘅 __init__(加 merged_accounts):
def __init__(self):
self.accounts = {}
self.payment_counter = 0
self.payment = {}
self.merged_accounts = {} # L4 加
self.account_locks = defaultdict(asyncio.Lock)
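def merge_accounts(self, timestamp, account_id_1, account_id_2):  # merge two accounts
    self._process_cashbacks(timestamp)  # settle due cashbacks first
    if account_id_1 not in self.accounts or account_id_2 not in self.accounts:
        return False  # both accounts must exist
    if account_id_1 == account_id_2:
        return False  # an account cannot be merged into itself
    acc_1 = self.accounts[account_id_1]  # the surviving account
    acc_2 = self.accounts[account_id_2]  # the account being absorbed
    acc_1["balance"] += acc_2["balance"]  # combine balances
    acc_1["outgoing"] += acc_2["outgoing"]  # combine outgoing totals
    acc_1["history"].append((timestamp, acc_1["balance"]))  # record the new balance
    for p in self.payment.values():  # re-point pending payments, otherwise _process_cashbacks raises KeyError on the deleted account
        if p["account_id"] == account_id_2:
            p["account_id"] = account_id_1
    self.merged_accounts[account_id_2] = account_id_1  # record: old → new
    del self.accounts[account_id_2]  # delete the absorbed account
    return True  # merge succeeded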
merge_accounts(10, "alice", "bob") 做咗咩:
之前:
alice: balance=300, outgoing=200
bob: balance=400, outgoing=0
之後:
alice: balance=700, outgoing=200
bob: 已刪(del)
self.merged_accounts = {"bob": "alice"}
合併規則:
1. 兩個都要存在 + 唔係同一個
2. acc_1 吸收 acc_2 嘅 balance + outgoing
3. 記低邊個 merge 去邊個
4. 刪 acc_2
backup / deepcopy:
import copy
snapshot = copy.deepcopy(self.accounts)
# 完整複製,改 snapshot 唔影響原本
# 用途:backup / restore
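Why `deepcopy` and not a plain copy: with nested dicts, a shallow copy still shares the inner dicts and lists. A minimal sketch with made-up data:

```python
import copy

accounts = {"alice": {"balance": 500, "history": [(1, 0), (3, 500)]}}

shallow = dict(accounts)            # copies only the outer dict
deep = copy.deepcopy(accounts)      # copies every nested dict and list

# Mutate the live data after the snapshots were taken.
accounts["alice"]["balance"] = 0
accounts["alice"]["history"].append((9, 0))

print(shallow["alice"]["balance"])  # 0   (shares the inner dict with the live data)
print(deep["alice"]["balance"])     # 500 (truly independent snapshot)
```

The same applies on restore: copying the snapshot again before putting it back keeps the stored backup from being changed by later operations.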
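Process several operations concurrently. Skeleton: async def + execute_op + gather + lock. The if/elif inside changes from question to question; copy the template and adjust a few lines.
# Step 1: define how a single op is handled
# Step 2: take locks according to how many accounts the op touches
# Step 3: collect all the coroutines, then gather them in one go
async def process_batch(self, timestamp, operations):  # run a batch of bank operations, like several clerks working at once
    self._process_cashbacks(timestamp)  # settle due cashbacks before any new op runs
    async def execute_op(op):  # handles exactly one op; gather runs them all together
        # Step 1: a transfer touches two accounts, so fix the lock order first
        if op["type"] == "transfer":  # moves money from one account to another
            if op["source_id"] == op["target_id"]:  # asyncio.Lock is not reentrant: locking the same account twice would hang
                return self.transfer(timestamp, op["source_id"], op["target_id"], op["amount"])  # transfer rejects this case and returns None
            keys = sorted([op["source_id"], op["target_id"]])  # always lock in the same order so two ops cannot wait on each other
            async with self.account_locks[keys[0]]:  # lock the account that sorts first
                async with self.account_locks[keys[1]]:  # then the second; move money only when both are held
                    return self.transfer(timestamp, op["source_id"], op["target_id"], op["amount"])  # reuse the sync transfer; this wrapper only handles ordering and protection
        # Step 2: deposit / pay touch one account, so one lock is enough
        aid = op["account_id"]  # the account this op changes, used as the lock key
        async with self.account_locks[aid]:  # only one op at a time may change a given account
            if op["type"] == "deposit":
                return self.deposit(timestamp, op["account_id"], op["amount"])  # delegate to the sync deposit
            elif op["type"] == "pay":
                return self.pay(timestamp, op["account_id"], op["amount"])  # delegate to the sync pay (deducts and records outgoing)
    # Step 3: wrap every op in a coroutine and start them together
    tasks = []  # coroutines waiting to be handed to gather
    for op in operations:  # every op passed in
        tasks.append(execute_op(op))  # creates the coroutine; it has not started yet
    results = await asyncio.gather(*tasks)  # run them all; the locks protect each account
    return list(results)  # one result per op, in the same order as operations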
L5 嘅 __init__(加 account_locks):
def __init__(self):
self.accounts = {}
self.payment_counter = 0
self.payment = {}
self.merged_accounts = {}
self.account_locks = defaultdict(asyncio.Lock) # L5 加
L5 嘅 data structure:
冇加新 user data field
多咗 self.account_locks(concurrency 用)
defaultdict(asyncio.Lock) = 每次 access 新 key 自動建一把 Lock
Lock 點用:
transfer 要鎖兩個 account(防止 A→B 同 B→A 同時跑)
Why sorted? To prevent deadlock (these are asyncio tasks, not threads):
Task 1: lock(alice) → lock(bob)
Task 2: lock(alice) → lock(bob) # same order → safe
Without sorting:
Task 1: lock(alice) → lock(bob)
Task 2: lock(bob) → lock(alice) # each waits for the other → deadlock!
deposit / pay 淨係一個 account → 鎖一個就夠
L5 用嘅 helper:
冇新 helper
直接 call 返 transfer / deposit / pay
gather 同時跑所有 op,lock 保證同一 account 唔會 race
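The sorted pair-lock idea can be exercised in isolation. This is a minimal sketch with a plain `balances` dict and a hypothetical async `transfer` helper; the `asyncio.sleep(0)` between the two acquisitions is there only to force the tasks to interleave, which is exactly the situation where unsorted locking would hang.

```python
import asyncio
from collections import defaultdict

async def main():
    balances = {"alice": 500, "bob": 200}
    locks = defaultdict(asyncio.Lock)   # one lock per account, created on first use

    async def transfer(src, dst, amount):
        first, second = sorted([src, dst])   # same acquisition order for A→B and B→A
        async with locks[first]:
            await asyncio.sleep(0)           # yield so the other task gets a turn
            async with locks[second]:
                if balances[src] < amount:
                    return None
                balances[src] -= amount
                balances[dst] += amount
                return balances[src]

    results = await asyncio.gather(
        transfer("alice", "bob", 100),   # alice → bob
        transfer("bob", "alice", 50),    # bob → alice, opposite direction
    )
    return list(results), balances

results, balances = asyncio.run(main())
print(results)   # [400, 250]
print(balances)  # {'alice': 450, 'bob': 250}
```

If the two `async with` lines used `src` and `dst` directly instead of the sorted pair, the first task would hold alice and wait for bob while the second held bob and waited for alice, and `gather` would never return.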
Same as L5, plus a Semaphore that caps how many run at once. The Lock wraps the data mutation; the Sem wraps the external call (sleep). Failed ops return False without sleeping. Copy the template and change the check + deduction part.
# Step 1: lock the account locally and do the fail-fast checks
# Step 2: leave the lock only after the money has really been deducted
# Step 3: only ops that passed enter the semaphore, simulating a slow external API
async def process_external_transfers(self, timestamp, transfers, max_concurrent):  # a batch of external transfers; the external part is rate limited
    self._process_cashbacks(timestamp)  # settle due cashbacks so balances are current
    sem = asyncio.Semaphore(max_concurrent)  # the external API has N windows; at most N ops are in flight
    async def do_transfer(t):  # handles one external transfer
        aid = t["account_id"]  # the account to deduct from
        lock = self.account_locks[aid]  # one op at a time per account
        # Step 1: local fail-fast checks; reject immediately if they fail
        async with lock:  # hold the lock before touching balance and history
            if aid not in self.accounts:  # no such account
                return False  # fail immediately without taking an API slot
            if self.accounts[aid]["balance"] < t["amount"]:  # insufficient funds
                return False  # fail immediately without queueing for the API
            # Step 2: checks passed, so deduct and record
            self.accounts[aid]["balance"] -= t["amount"]  # deduct the amount
            self.accounts[aid]["outgoing"] += t["amount"]  # add to outgoing, used by top_spenders
            self.accounts[aid]["history"].append((timestamp, self.accounts[aid]["balance"]))  # record the balance after the deduction
        # Step 3: only ops that deducted successfully enter the semaphore (outside the lock)
        async with sem:  # take an API slot only now, so failed ops never block real ones
            await asyncio.sleep(0.01)  # simulate the wait on the external payment gateway
        return True  # local deduction and external call both done
    tasks = []  # one coroutine per transfer
    for t in transfers:  # every transfer in the batch
        tasks.append(do_transfer(t))  # creates the coroutine; it has not started yet
    results = await asyncio.gather(*tasks)  # run them all; lock and sem enforce the rules
    return list(results)  # True/False per transfer, in the same order as transfers
L6 嘅 __init__(同 L5 一樣):
def __init__(self):
self.accounts = {}
self.payment_counter = 0
self.payment = {}
self.merged_accounts = {}
self.account_locks = defaultdict(asyncio.Lock)
L5 vs L6 分別:
L5:gather + Lock
全部同時跑,鎖住 account 防 race
L6:gather + Lock + Semaphore
Lock 包住改 data
Sem 包住外部 call(限制同時幾多個)
分開用!唔好 nested
Lock vs Semaphore:
Lock = 一把鎖,同一時間只有一個 task 拎到
用途:保護 data(balance 扣錢唔 race)
Semaphore(N) = N 張准考證
用途:限制同時幾多個 task 做外部 call
例如 max_concurrent=3 → 同時最多 3 個 sleep
Fail-fast 重點:
check 失敗嘅 → return False
唔入 sem、唔 sleep、唔佔准考證
只有 check 過咗嘅先入 sem + sleep
呢個 pattern 喺 L6 spec 有明確要求
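To check that failed ops really stay out of the semaphore, a small instrumented sketch helps. Everything here (`pay_out`, the `jobs` list, the counters) is made up for illustration. There is no per-account lock in this sketch because no `await` sits between the check and the deduction.

```python
import asyncio

async def main():
    balances = {"alice": 100, "bob": 5}
    sem = asyncio.Semaphore(2)          # at most 2 external calls in flight
    in_flight = 0
    peak = 0
    sleeps = 0

    async def pay_out(account_id, amount):
        nonlocal in_flight, peak, sleeps
        # Fail fast: invalid requests return before touching the semaphore.
        if account_id not in balances or balances[account_id] < amount:
            return False
        balances[account_id] -= amount
        async with sem:                 # only valid requests take a slot
            in_flight += 1
            peak = max(peak, in_flight)
            sleeps += 1
            await asyncio.sleep(0.01)   # stand-in for the external API call
            in_flight -= 1
        return True

    jobs = [("alice", 10), ("ghost", 10), ("bob", 50), ("alice", 20), ("alice", 30)]
    results = await asyncio.gather(*(pay_out(a, n) for a, n in jobs))
    return list(results), peak, sleeps, balances

results, peak, sleeps, balances = asyncio.run(main())
print(results)  # [True, False, False, True, True]
print(peak)     # 2
print(sleeps)   # 3
```

Five requests come in, but only the three valid ones sleep, and never more than two at the same time.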
| Error | Meaning | Fix |
|---|---|---|
| TypeError: takes N positional arguments but M were given | wrong number of params | check for a missing self or a missing timestamp |
| AttributeError: 'dict' object has no attribute 'X' | dot access used on a dict | change to d["X"] |
| KeyError: 'X' | the dict has no such key | check `if key in d` first |
| AssertionError: None is not true | the function has no return | add return True/False |
| NameError: name 'X' is not defined | variable not defined | check spelling / scope |
Mnemonic: 1. Read the error type. 2. Read the line marked `>` (how the test calls you). 3. Compare it with your code.
| What to do | Code |
|---|---|
| create empty dict | d = {} |
| add / update | d[key] = value |
| delete | del d[key] |
| check existence | if key in d: |
| get value (safe) | d.get(key) |
| loop over dict | for k, v in d.items(): |
| create empty list | lst = [] |
| append to end | lst.append(x) |
| first n items | lst[:n] |
| format string | f"{name}({value})" |
| join list | ", ".join(lst) |
| sort desc + asc | sorted(items, key=lambda x: (-x[1], x[0])) |
| deep copy | copy.deepcopy(data) |
| reversed loop | for ts, val in reversed(lst): |
| auto ID | self.counter += 1; id = f"item{self.counter}" |
| round down | amount * 2 // 100 |
| run concurrently | tasks = [fn(x) for x in items]; results = await asyncio.gather(*tasks) |
| hold a lock | async with lock: |
| cap concurrency | sem = asyncio.Semaphore(n); async with sem: |
A hotel: book to check in, checkout to leave, late fee for leaving late, upgrade to move rooms. Almost the same pattern as Bank; only the domain differs.
Hotel 同 Bank 嘅核心對照:
Bank account → Hotel room
Bank balance → Hotel total_revenue
Bank cashback → Hotel late_fee(lazy 收費)
Bank merge → Hotel upgrade(搬人唔合併房)
Hotel 獨有嘅嘢:
guest_name == "" → available(可 book)
guest_name != "" → occupied(已入住)
checkout 唔刪 room,只係清空 guest_name
booking_id 自動遞增:"booking1", "booking2", ...
Hotel is a "reskinned" Bank. The data structure and logic are about 90% the same; only the field names and domain rules change.
Bank                          Hotel
───────────────────────       ───────────────────────
create_account                add_room / book_room
get_balance                   get_room_info
deposit/withdraw              (none, revenue is computed automatically)
top_spenders                  top_rooms (by revenue)
cashback                      late_fee
merge                         upgrade_room (moves the guest, no merge)
process_batch                 batch_operations (same)
process_external_transfers    send_notifications (fail-fast)
── L1 CRUD ──
🟰 book_room: same as Bank create_account
⚠️ checkout: no Bank counterpart (Hotel only)
🟰 get_room_info: similar to Bank get_balance
── L2 Sort ──
🟰 list_rooms: same as Bank list_accounts
🟰 top_revenue: same pattern as Bank top_spenders
── L3 Late Fee ──
⚠️ set_late_fee: Bank credits a cashback, Hotel charges a late fee
🟰 _process_late_fee: same lazy pattern as Bank _process_cashbacks
── L4 Backup ──
🟰 backup / restore: identical to Bank (deepcopy)
⚠️ upgrade_room: Bank merge adds balances, Hotel upgrade switches rooms
── L5 Batch ──
🟰 batch_operations: same as Bank L5 (lock per room_id)
── L6 Sync ──
🟰 sync: same as Bank L6 (fail-fast + sleep)
import asyncio
from collections import defaultdict
class HotelSystem:
    def __init__(self):
        self.rooms = {}  # L1: room_id → room info dict
        self.booking_counter = 0  # L1: auto-increment booking ID
        self.rooms_locks = defaultdict(asyncio.Lock)  # added at L5: one async lock per room_id
{
"r1": {
"price_per_night": 100,
"guest_name": "Alice",
"nights": 2,
"total_revenue": 200,
"check_in_time": 1,
"late_fee": 0,
"late_fee_time": 1,
"history": [],
"booking_id": "booking1"
}
}
guest_name == "" → available
guest_name != "" → occupied
late_fee 擺喺 room 入面(唔係外面 list)
L1:rooms dict, booking_counter
L3:late_fee, late_fee_time(塞入 room dict 入面)
L4:history[](塞入 room dict 入面)
L5:rooms_locks = defaultdict(asyncio.Lock)
L6:(冇加新 field,semaphore 喺 method 入面開)
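Lazy means there is no background timer: the refresh runs only inside methods that need to see the latest active set. Most often that is the first line of every public method, but if the spec defines an explicit cleanup API, that API triggers it instead. The helper checks each room for a late fee that has come due; when it has, the fee is added to revenue and reset to zero.
# Helper: _process_late_fee, lazy fee collection (called at the start of every public method)
def _process_late_fee(self, timestamp):  # note the singular: late_fee
    for room_id, room in self.rooms.items():  # look at each room
        if room["late_fee"] > 0 and timestamp >= room["late_fee_time"]:  # a fee is pending and due
            room["total_revenue"] += room["late_fee"]  # add it to revenue
            room["late_fee"] = 0  # reset to zero (collected)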
_process_late_fee(timestamp)
行一次 self.rooms
凡係:
1. late_fee > 0(有未收嘅費)
2. timestamp >= late_fee_time(到期)
就 revenue += fee,fee = 0
每個 public method 第一行都 call 一次(lazy 模式)
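Bank: cashback due → balance += amount (money credited)
Hotel: late_fee due → revenue += fee (added to revenue)
Bank: cashbacks live in a separate dict, self.payment
Hotel: late_fee lives inside the room dict (simpler)
In common: both are lazy (no background timer)
both are called at the start of every public method
both settle once the item is due and then clear or mark it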
def add_room(self, timestamp, room_id, price_per_night):  # add a new room
    self._process_late_fee(timestamp)  # L3 addition: settle due late fees first
    if room_id in self.rooms:  # room already exists?
        return False  # yes, so do not add it again
    self.rooms[room_id] = {  # create the room with all its fields
        "price_per_night": price_per_night,  # price per night
        "guest_name": "",  # empty = vacant
        "nights": 0,  # nights booked (set by book_room); 0 while vacant, matching what checkout writes
        "total_revenue": 0,  # accumulated revenue
        "check_in_time": timestamp,  # time of the last check-in
        "late_fee": 0,  # added at L3
        "late_fee_time": timestamp,  # added at L3
        "history": [],  # added at L4
        "booking_id": ""  # no booking yet
    }
    return True  # room added
def book_room(self, timestamp, room_id, guest_name, nights): # 訂房
self._process_late_fee(timestamp) # 先處理到期 fee
if room_id not in self.rooms: return None # 房間唔存在
room = self.rooms[room_id] # 攞房間 pointer
if room["guest_name"] != "": return None # 有人住緊 → 唔得 book
room["guest_name"] = guest_name # 記低邊個住
room["nights"] = nights # 記住幾晚
room["total_revenue"] += room["price_per_night"] * nights # 即刻計收入
room["check_in_time"] = timestamp # 記 check-in 時間
self.booking_counter += 1 # 自動遞增 ID
booking_id_str = f"booking{self.booking_counter}" # 生成 "booking1", "booking2"...
room["booking_id"] = booking_id_str # 記入房間資料
return booking_id_str # 回傳 booking ID
def checkout(self, timestamp, room_id): # 退房
self._process_late_fee(timestamp) # 先處理到期 fee
if room_id not in self.rooms: return None # 房間唔存在
room = self.rooms[room_id] # 攞房間 pointer
if room["guest_name"] == "": return None # 冇人住 → 冇得退
guest_name = room["guest_name"] # 記住客人名(等陣 return 用)
room["guest_name"] = "" # 清空 = 退咗房
room["history"].append((guest_name, room["price_per_night"] * room["nights"])) # L4 加
room["nights"] = 0 # 清零晚數
return guest_name # 回傳退房嘅人名
"r1": {
"guest_name": "Alice", ← 入住
"nights": 2,
"total_revenue": 200, ← 100 * 2
"booking_id": "booking1"
}
"r1": {
"guest_name": "", ← 走咗
"total_revenue": 200, ← 唔變
"history": [("Alice", 200)] ← 加咗
}
return "Alice" ← return 名,唔係 True
# book_room return booking ID
# 唔係 return True/balance
bid = "booking1" ← return 呢個
# checkout return guest name
# 唔係 return True/False
return "Alice" ← return 呢個
# check occupied 用 guest_name
if room["guest_name"] != "": ← occupied
if room["guest_name"] == "": ← available
def top_rooms(self, timestamp, n): # 排頭 N 間最多收入嘅房
self._process_late_fee(timestamp) # 先處理到期 fee
sorted_items = dict(sorted( # 排序:revenue 大排先,同分按房名 asc
self.rooms.items(), # 將全部房間攤平畀 sorted 排收入榜;之後先切頭 N 間
key=lambda x: (-x[1]["total_revenue"], x[0]) # -值 = desc,名 = asc
))
result = [] # 準備裝 format 完嘅 string
for room_id, room in sorted_items.items(): # 逐間房行
result.append(f"{room_id}({room['total_revenue']})") # 格式化成 "房名(收入)"
return result[:n] # 淨攞頭 N 個
def find_available(self, timestamp, min_price, max_price): # 搵有空嘅房
self._process_late_fee(timestamp) # 先處理到期 fee
sorted_items = dict(sorted( # 按價錢排(平排先)
self.rooms.items(), # 將全部房資料交畀 sorted 排價錢;之後先慢慢篩有冇人住
key=lambda x: (x[1]["price_per_night"], x[0]) # 價錢 asc,同價按名 asc
))
result = [] # 準備裝符合條件嘅房
for room_id, room in sorted_items.items(): # 逐間房行
if min_price <= room["price_per_night"] <= max_price and room["guest_name"] == "": # 價錢啱 + 冇人住
result.append(room_id) # 加入結果
return result # 回傳所有符合嘅房
self.rooms = {
"r1": {"price_per_night": 100, "guest_name": "Alice", "total_revenue": 200},
"r2": {"price_per_night": 150, "guest_name": "", "total_revenue": 450},
"r3": {"price_per_night": 80, "guest_name": "", "total_revenue": 0}
}
top_rooms 用:
room["total_revenue"] + room_id
find_available 用:
room["price_per_night"] + room["guest_name"]
self.rooms.items() 出嚟嘅 x:
x = ("r1", {"total_revenue": 200, "price_per_night": 100})
x = ("r2", {"total_revenue": 450, "price_per_night": 150})
x = ("r3", {"total_revenue": 0, "price_per_night": 80})
x[0] = "r1" # room_id
x[1] = {"total_revenue": 200} # 成個 room dict
x[1]["total_revenue"] = 200 # 攞收入
key=lambda x: (-x[1]["total_revenue"], x[0])
# #
-450 排先(desc) 同分按房名排(asc)
排完:
(-450, "r2")
(-200, "r1")
(0, "r3")
result = ["r2(450)", "r1(200)", "r3(0)"]
result[:2] = ["r2(450)", "r1(200)"]
rooms = {
"r1": {"price_per_night": 100, "guest_name": "Alice"},
"r2": {"price_per_night": 150, "guest_name": ""},
"r3": {"price_per_night": 80, "guest_name": ""}
}
find_available(1, 50, 200)
→ ["r3", "r2"]
# r1 有人住,排除
# r3(80) 排先,r2(150) 排後
# find_available 要 check occupied!
# 唔係淨 check price range
if price_ok and room["guest_name"] == "":
# ^^^^^^^^^^^^^^^^^^^^^^^^
# 漏咗呢個就全部 room 都出嚟
def late_checkout(self, timestamp, room_id, extra_hours): # 延遲退房(加 fee)
self._process_late_fee(timestamp) # 先處理到期 fee
if room_id not in self.rooms: return None # 房間唔存在
room = self.rooms[room_id] # 攞房間 pointer
if room["guest_name"] == "": return None # 冇人住 → 冇得 late checkout
fee = extra_hours * 50 # 計罰款(每小時 50)
room["late_fee"] = fee # 擺喺 room 入面
room["late_fee_time"] = timestamp + 3600000 # 1 小時後先收(deadline)
self.checkout(timestamp, room_id) # 先幫客人即刻退房交吉;罰款就另外掛單,等夠鐘先正式入酒店收入
return fee # 回傳罰款金額
def get_pending_fees(self, timestamp): # 查未收嘅 fee
self._process_late_fee(timestamp) # 先處理到期 fee
all_fee = 0 # 準備加總
for room_id, room in self.rooms.items(): # 逐間房睇
if room["late_fee"] != 0: # 呢間房有未收嘅 fee?
all_fee += room["late_fee"] # 加埋
return all_fee # 回傳所有未收 fee 嘅總和
fee = 4 * 50 = 200
late_fee_time = 100 + 3600000 = 3600100
"r1": {
"late_fee": 200,
"late_fee_time": 3600100
}
The guest checks out immediately (the room becomes available)
The fee is added to revenue only once a call arrives with timestamp >= 3600100
# 1. checkout 入面唔好 call late_checkout
# 兩個係獨立嘅 method
# caller 決定 call 邊個
# 2. late_checkout 要做兩件事:
# checkout guest + schedule fee
# 要 return fee amount
# 3. _process_late_fee checks late_fee > 0
# do not check guest_name
# because the guest has already left while the fee is still pending
if room["late_fee"] > 0: ✅
if room["guest_name"] != "": ❌ the guest has already left
# 4. _process_late_fee 唔加 history
# history 已經喺 checkout 加咗
def get_booking_history(self, timestamp, room_id): # 攞房間嘅訂房記錄
self._process_late_fee(timestamp) # 先處理到期 fee
if room_id not in self.rooms: return None # 房間唔存在
room = self.rooms[room_id] # 攞房間 pointer
return list(room["history"]) # history 已經係 [(guest, cost), ...] 直接 copy
def upgrade_room(self, timestamp, from_room_id, to_room_id): # 升級房型
self._process_late_fee(timestamp) # 先處理到期 fee
if from_room_id not in self.rooms or to_room_id not in self.rooms: return False # 兩間房都要存在
if from_room_id == to_room_id: return False # 唔准自己 upgrade 自己
from_room = self.rooms[from_room_id] # 攞舊房張資料卡;等陣要照住佢將住客搬走
to_room = self.rooms[to_room_id] # 攞新房張資料卡;之後會接手個客同 booking 資料
if from_room["guest_name"] == "": return False # from 要有人
if to_room["guest_name"] != "": return False # to 要冇人
# 直接搬,唔好 call checkout
to_room["guest_name"] = from_room["guest_name"] # 搬客人名去新房
to_room["booking_id"] = from_room["booking_id"] # 搬 booking ID 去新房
to_room["nights"] = from_room["nights"] # 搬晚數去新房
from_room["guest_name"] = "" # 舊房清空(available)
return True # 升級成功
之前:
"r1": {"guest":"Alice", "total_revenue":200}
"r2": {"guest":"", "total_revenue":0}
之後:
"r1": {"guest":"", "total_revenue":200}
"r2": {"guest":"Alice", "total_revenue":0}
# Alice 搬咗去 r2
# revenue 留喺 r1(唔跟人走)
# booking_id 跟人走
# 1. history return tuple 唔係 string
("Alice", 200) ✅ tuple
f"(Alice,200)" ❌ string
# 2. history records only bookings that have checked out
# do not append to history in add_room / book_room
# history = [] starts empty
# 3. upgrade 唔係 merge
# upgrade 搬 guest,唔合併 room
# revenue 留喺原 room
# 4. upgrade 唔加 history
# guest 只係搬房,唔算 checkout
# 5. Spec 冇講嘅就唔做
# checkout 唔好自動判斷 late
# 每個 method 只做 spec 講嘅嘢
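Process book / checkout / upgrade concurrently. One lock per room. upgrade touches two rooms, so it locks both.
# When to lock 1 key vs 2 keys:
# book/checkout → changes one room only → lock 1
# upgrade/transfer → changes two rooms → lock 2 (sorted order prevents deadlock)
async def batch_operations(self, timestamp, operations):  # process a batch of hotel ops; same room queues up, different rooms run in parallel
    self._process_late_fee(timestamp)  # collect due late fees before handling new ops
    # Step 1: define how a single operation is handled
    async def execute_op(op):  # every op ends up in this helper
        # Step 2: upgrade touches the old and the new room, so take both locks in a fixed order
        if op["type"] == "upgrade":
            if op["from_room_id"] == op["to_room_id"]:  # asyncio.Lock is not reentrant: locking the same room twice would hang
                return self.upgrade_room(timestamp, op["from_room_id"], op["to_room_id"])  # upgrade_room rejects this case and returns False
            keys = sorted([op["from_room_id"], op["to_room_id"]])  # fixed order, so A never waits on B while B waits on A
            async with self.rooms_locks[keys[0]]:  # lock the room that sorts first
                async with self.rooms_locks[keys[1]]:  # then the second, so no other op cuts in mid-move
                    return self.upgrade_room(timestamp, op["from_room_id"], op["to_room_id"])  # the move itself is done by upgrade_room; this wrapper only keeps order
        # Step 3: book / checkout touch one room, so one lock is enough
        rid = op["room_id"]  # the room this op is about
        async with self.rooms_locks[rid]:  # only one op at a time per room
            if op["type"] == "book":  # new guest checks in
                return self.book_room(timestamp, op["room_id"], op["guest_name"], op["nights"])  # result follows book_room's own rules
            elif op["type"] == "checkout":  # current guest leaves
                return self.checkout(timestamp, op["room_id"])  # the caller receives the checkout result
    # Step 4: wrap every op in a coroutine, then start them together
    tasks = []  # all pending ops, handed to gather below
    for op in operations:  # every op in the batch
        tasks.append(execute_op(op))  # one coroutine per op, input order preserved
    results = await asyncio.gather(*tasks)  # different rooms run in parallel; the same room queues on its lock
    return list(results)  # results in the original order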
def __init__(self):
self.rooms = {}
self.booking_counter = 0
self.rooms_locks = defaultdict(asyncio.Lock)
book_room(t, "r1", "Alice", 2)
→ 只改 r1 → 鎖 rooms_locks["r1"]
checkout(t, "r1")
→ 只改 r1 → 鎖 rooms_locks["r1"]
upgrade_room(t, "r1", "r2")
→ 改 r1 同 r2 → 鎖兩個
→ sorted(["r1","r2"]) → 先鎖 r1 再鎖 r2
→ 防 deadlock(A 鎖 r1 等 r2,B 鎖 r2 等 r1)
1. lock per ROOM_ID
2. async with lock: 入面 call sync method
3. asyncio.gather(*[...]) 同時跑
4. upgrade sorted lock 防 deadlock
5. return list(results)
Send notifications concurrently. Failed ones (room does not exist, or the guest does not match) do not sleep. Fail-fast pattern.
async def send_notifications(self, timestamp, notifications, max_concurrent):  # notify guests concurrently; invalid ones are rejected at once, valid ones go through the rate limit
    self._process_late_fee(timestamp)  # collect due fees first so room state is current
    sem = asyncio.Semaphore(max_concurrent)  # only N staff can be on the phone at the same time
    # Step 1: define how a single notification is validated and sent
    async def do_notification(t):  # every notification goes through this helper
        rid = t["room_id"]  # the room this notification targets
        lock = self.rooms_locks[rid]  # that room's own lock, so nobody changes the room meanwhile
        # Step 2: fail-fast validation inside the lock
        async with lock:  # verify the guest first; a mismatch must not use up send quota
            if rid not in self.rooms:  # the room does not exist
                return False  # fail immediately, no sleep
            if self.rooms[rid]["guest_name"] == "":  # the room is vacant
                return False  # no target guest, so leave right away
            if self.rooms[rid]["guest_name"] != t["guest_name"]:  # the name does not match the actual guest
                return False  # wrong guest, so do not queue
        # Step 3: only validated notifications enter the semaphore (outside the lock)
        async with sem:  # take a send slot; caps how many go out at once
            await asyncio.sleep(0.01)  # simulate the external notification service
        return True  # the notification was really sent
    # Step 4: wrap every notification in a coroutine and hand them all to gather
    tasks = []  # the whole batch
    for t in notifications:  # every notification
        tasks.append(do_notification(t))  # one coroutine each, input order preserved
    results = await asyncio.gather(*tasks)  # waits for all of them; lock + sem set the pace
    return list(results)  # True/False in input order
失敗嘅 notification 唔 acquire sem,即刻 return False
成功嘅先攞 sem → sleep → return True
3 個 fail 條件(任一 = False):
1. room 唔存在
2. guest_name 唔 match(唔係呢個人住)
3. room 冇人住(guest_name == "")
流程:
lock → check → fail? return False(唔 sleep)
pass? → sem → sleep → return True
Hotel L6 = fail-fast:
check 喺 sem 之前
失敗嘅唔 sleep → 唔佔 sem 位
NF L6 = all-sleep:
全部入 sem → sleep → 先 check
失敗嘅都 sleep → 佔 sem 位
考試 timing test 分得出嚟
notifications = [
{"room_id":"r1","guest_name":"Alice"}, ← r1 有 Alice → pass
{"room_id":"r9","guest_name":"Bob"}, ← r9 唔存在 → fail-fast
{"room_id":"r1","guest_name":"Eve"}, ← guest_name 唔 match → fail-fast
]
max_concurrent = 1
→ [True, False, False]
# r9 同 Eve 即刻 return False(0 秒)
# 只有 Alice 真正 sleep(0.01 秒)
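The difference between fail-fast and all-sleep comes down to where the validity check sits relative to the semaphore. The sketch below toggles between the two with a flag so the sleep counts can be compared directly; `run_batch`, `handle`, and the `fail_fast` flag are hypothetical names, and a real spec would use only one of the two behaviours.

```python
import asyncio

async def run_batch(requests, known_rooms, fail_fast):
    sem = asyncio.Semaphore(2)
    sleeps = 0

    async def handle(room_id):
        nonlocal sleeps
        valid = room_id in known_rooms
        if fail_fast and not valid:
            return False                 # fail-fast: invalid requests never take a slot
        async with sem:                  # all-sleep: invalid requests queue here too
            sleeps += 1
            await asyncio.sleep(0.01)    # stand-in for the external call
        return valid                     # all-sleep reports the failure after sleeping

    results = await asyncio.gather(*(handle(r) for r in requests))
    return list(results), sleeps

requests = ["r1", "r9", "r2"]            # r9 does not exist
rooms = {"r1", "r2"}

fail_fast_run = asyncio.run(run_batch(requests, rooms, fail_fast=True))
all_sleep_run = asyncio.run(run_batch(requests, rooms, fail_fast=False))
print(fail_fast_run)  # ([True, False, True], 2)
print(all_sleep_run)  # ([True, False, True], 3)
```

The returned lists are identical, which is why result-only tests cannot tell the two apart; only the number of sleeps, and therefore the total elapsed time, differs.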
A key-field-value database (similar to a Redis Hash). The biggest differences from Bank: a nested dict, and expired entries are never physically deleted; reads simply skip the dead ones.
InMemDB 同 Bank 嘅核心分別:
1. Nested dict(兩層)vs Flat dict(一層)
Bank: self.accounts["alice"]["balance"]
DB: self.data["user1"]["name"]["value"]
2. TTL 用 inline check vs lazy helper
Bank: _process_cashbacks() 每個 method 開頭 call
DB: _is_alive(fd, ts) 喺每次讀嗰陣 inline check
3. L2 return string vs list
Bank: ["alice(500)", "bob(200)"]
DB: "age(30), name(alice)"
4. L6 all-sleep vs fail-fast
Bank: 唔存在嘅唔 sleep
DB: 全部都 sleep
InMemDB is an in-memory key-value store. Each key has multiple fields, and each field has a value and an optional TTL, much like the Redis Hash data type.
Bank InMemDB
─────────────────────── ───────────────────────
self.accounts[id] self.data[key][field]
一層 dict 兩層 dict
cashback → lazy helper TTL → inline _is_alive()
list_accounts → list scan → string
L6 fail-fast L6 all-sleep
backup: deepcopy backup: remaining_ttl
You already know Bank. InMemDB differs in 4 ways:
# Bank: flat dict, one level
self.accounts["alice"] = {
    "balance": 500,
    "outgoing": 100
}
# TTL via a lazy helper:
# called at the top of every method,
# and it really deletes whatever has expired
def deposit(self, ts, ...):
    self._process_cashbacks(ts)
    ...
# L2 returns a list
["alice(500)", "bob(200)"]
# L6 fail-fast
# failures do not sleep
# InMemDB: nested dict, two levels
self.data["user1"]["name"] = {
    "value": "alice",
    "expiry": None
}
# TTL via an inline check:
# nothing is really deleted,
# dead fields are just skipped when read
def get_at(self, key, field, ts):
    fd = self.data[key][field]
    if not self._is_alive(fd, ts):
        return ""
    ...
# L2 returns a single string
"age(30), name(alice)"
# L6 all-sleep
# everything sleeps
class InMemoryDB:
    def __init__(self):
        self.data = {}                              # L1
        self.backups = []                           # added in L4
        self.key_locks = defaultdict(asyncio.Lock)  # added in L5
| name | email | session
---------+-------------+-------------+-------------
user1 | alice | a@b.com | abc
user2 | bob | |
| name | email | session
---------+-------------------+--------------------+-------------------
user1 | {"value":"alice", | {"value":"a@b.com",| {"value":"abc",
| "expiry": None} | "expiry": None} | "expiry": 60}
---------+-------------------+--------------------+-------------------
user2 | {"value":"bob", | |
| "expiry": None} | |
# key = row, field = column, each cell = {value, expiry}
# read the value:
self.data[key][field]["value"]   # → "alice"
# check whether it has expired:
self.data[key][field]["expiry"]  # None = lives forever
                                 # 60 = dead once timestamp reaches 60
Two-level dict. set returns nothing. get returns "" if missing. delete returns True/False.
def set(self, key, field, value):  # create or overwrite one cell in a row
    if key not in self.data:  # first time this key is seen
        self.data[key] = {}  # open the row so fields have somewhere to go
    self.data[key][field] = {  # overwrites any previous value
        "value": value,  # the stored content; get/scan read from here
        "expiry": None  # L1 never expires; L3 adds the expiry time
    }
def get(self, key, field):  # read one cell; "" means nothing there
    if key not in self.data:  # the row does not exist
        return ""  # spec: empty string, not None
    if field not in self.data[key]:  # row exists but the field does not
        return ""
    return self.data[key][field]["value"]  # value only; expiry stays internal
def delete(self, key, field):  # remove one cell; drop the row if it ends up empty
    if key not in self.data:  # nothing to delete
        return False
    if field not in self.data[key]:  # row exists but the field does not
        return False
    del self.data[key][field]  # actually remove the cell
    if not self.data[key]:  # the row is now empty
        del self.data[key]  # remove the empty shell too
    return True  # a cell really was deleted
def __init__(self):
    self.data = {}                              # key → field → {value, expiry}
    self.backups = []                           # added in L4 (a list: backup() appends to it)
    self.key_locks = defaultdict(asyncio.Lock)  # added in L5
self.data = {  # main store (key → field → {value, expiry})
    "user1": {
        "name": {"value": "alice", "expiry": None},
        "email": {"value": "a@b.com", "expiry": None},
    },
    "user2": {
        "name": {"value": "bob", "expiry": None},
    },
}
self.backups = []                           # list of backups (added in L4)
self.key_locks = defaultdict(asyncio.Lock)  # per-key locks (added in L5)
No helper
The three CRUD methods are written directly; nothing is factored out
L2 returns a single string (not a list!), sorted alphabetically by field name. Format: "field(value), field(value)"
def scan(self, key):  # list every field in a row, sorted by name
    if key not in self.data:  # the row does not exist
        return ""  # empty string, not a list
    fields = sorted(self.data[key])  # alphabetical order keeps the output stable
    if not fields:  # row shell exists but has no fields
        return ""
    parts = []
    for f in fields:
        parts.append(f"{f}({self.data[key][f]['value']})")  # field(value)
    return ", ".join(parts)  # one comma-separated string
def scan_by_prefix(self, key, prefix):  # like scan, but only fields that start with prefix
    if key not in self.data:
        return ""
    fields = []  # field names that match the prefix
    for f in self.data[key]:
        if f.startswith(prefix):  # keep matches, skip everything else
            fields.append(f)
    fields.sort()  # alphabetical order
    if not fields:  # nothing matched
        return ""
    parts = []
    for f in fields:
        parts.append(f"{f}({self.data[key][f]['value']})")  # field(value)
    return ", ".join(parts)
# table view:
         | addr_city | addr_zip | name
---------+-----------+----------+--------
user1    | NYC       | 10001    | alice
# dict view:
self.data = {
    "user1": {
        "addr_city": {"value": "NYC", "expiry": None},
        "addr_zip": {"value": "10001", "expiry": None},
        "name": {"value": "alice", "expiry": None},
    }
}
# self.data["user1"] is:
# {"addr_city": {...}, "addr_zip": {...}, "name": {...}}
sorted(self.data["user1"])
# iterating a dict iterates its keys, so this collects every field name and sorts them
# → ["addr_city", "addr_zip", "name"]
# (this example happens to be in order already)
# if the field names were ["name", "addr_city", "addr_zip"]:
# sorted → ["addr_city", "addr_zip", "name"]
# a sorts first, n sorts last
fields = ["addr_city", "addr_zip", "name"]
# for f in fields → one pass per field name
# each pass builds one string: f"{f}({self.data[key][f]['value']})"
# pass 1, f = "addr_city":
# self.data["user1"]["addr_city"]["value"] → "NYC"
# builds: "addr_city(NYC)"
# pass 2, f = "addr_zip":
# self.data["user1"]["addr_zip"]["value"] → "10001"
# builds: "addr_zip(10001)"
# pass 3, f = "name":
# self.data["user1"]["name"]["value"] → "alice"
# builds: "name(alice)"
# parts = ["addr_city(NYC)", "addr_zip(10001)", "name(alice)"]
# ", ".join(parts) → joined with commas
→ "addr_city(NYC), addr_zip(10001), name(alice)"
# scan_by_prefix: take every field name in data["user1"]
# check each one: does it start with prefix?
# keep the ones that do, drop the rest
# sort the survivors alphabetically
# prefix = "addr_"
# all field names: ["addr_city", "addr_zip", "name"]
# addr_city → starts with "addr_"? yes → keep
# addr_zip → starts with "addr_"? yes → keep
# name → starts with "addr_"? no → drop
# sorted → ["addr_city", "addr_zip"]
→ "addr_city(NYC), addr_zip(10001)"
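The walkthrough above can be condensed into a standalone sketch. It assumes a free function that takes one row (`row`, a hypothetical name for `self.data[key]`) instead of a method on the class, so it can be run on its own.

```python
def scan_by_prefix(row, prefix):
    """Return 'field(value), ...' for fields starting with prefix, sorted by field name."""
    fields = sorted(f for f in row if f.startswith(prefix))
    # An empty match list joins to "", which is the "nothing found" result.
    return ", ".join(f"{f}({row[f]['value']})" for f in fields)

# Fields inserted out of order on purpose, to show that sorting fixes the output order.
row = {
    "name": {"value": "alice", "expiry": None},
    "addr_zip": {"value": "10001", "expiry": None},
    "addr_city": {"value": "NYC", "expiry": None},
}
print(scan_by_prefix(row, "addr_"))  # addr_city(NYC), addr_zip(10001)
print(scan_by_prefix(row, ""))       # addr_city(NYC), addr_zip(10001), name(alice)
```

An empty prefix matches every field, so plain `scan` is the special case `prefix == ""`.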
This is not a lazy helper (do not confuse it with Bank's _process_cashbacks). It is an inline check: before reading a field, ask "is this cell still alive?". Expired cells are not deleted, only skipped.
# Helper: _is_alive: is this field still alive right now?
def _is_alive(self, field_data, timestamp):
    if field_data["expiry"] is None:  # no TTL → lives forever
        return True
    if timestamp < field_data["expiry"]:  # not yet expired → alive (note: <, not <=)
        return True
    return False  # at or past expiry: the data may still be stored, but callers must treat it as gone
_is_alive(field_data, timestamp)
field_data = self.data[key][field] (one cell)
checks whether this cell is still alive
None → lives forever
timestamp < expiry → alive
timestamp >= expiry → dead
Bank/Hotel/FS:
_process_X(timestamp)
Every method that needs the latest active set refreshes first; usually that is the top of each public method, but if the spec defines an explicit cleanup API, that API triggers it
Walks all items and acts on the expired ones (delete / upgrade / charge)
This is "active cleanup"
InMemDB:
_is_alive(field_data, timestamp)
Called before reading a field
Checks only that one cell and does not modify data
This is "passive skipping" (nothing is deleted; dead cells are just ignored)
set_at_with_ttl("u1", "s", "abc", 10, 50)
→ expiry = 10 + 50 = 60
timestamp = 59: 59 < 60 → alive ✅
timestamp = 60: 60 < 60? NO → dead ❌
timestamp = 61: 61 < 60? NO → dead ❌
# uses <, not <=
# so timestamp == expiry already counts as dead
# the exam tests this boundary
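The boundary rule is easy to pin down with a few assertions. This is a minimal sketch that assumes a free function `is_alive(expiry, timestamp)` rather than the `_is_alive` method, so it can run without the class.

```python
def is_alive(expiry, timestamp):
    """A field is alive if it has no expiry, or if timestamp is strictly less than expiry."""
    if expiry is None:
        return True
    return timestamp < expiry

expiry = 10 + 50  # set at timestamp 10 with ttl 50
print(is_alive(expiry, 59))  # True: 59 < 60
print(is_alive(expiry, 60))  # False: the expiry instant itself counts as dead
print(is_alive(None, 999))   # True: no TTL
```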
No lazy helper here. Use the inline _is_alive check. Expired fields are not deleted, only skipped.
def set_at(self, key, field, value, timestamp):  # L3 plain set; timestamp is accepted but unused
    if key not in self.data:
        self.data[key] = {}
    self.data[key][field] = {"value": value, "expiry": None}  # still never expires
def set_at_with_ttl(self, key, field, value, timestamp, ttl):  # write a cell with an expiry
    if key not in self.data:
        self.data[key] = {}  # TTL applies to the field, not to the outer key
    expiry = timestamp + ttl  # the moment the field stops being visible
    self.data[key][field] = {"value": value, "expiry": expiry}
def get_at(self, key, field, timestamp):  # read one cell as of timestamp
    if key not in self.data:
        return ""
    if field not in self.data[key]:
        return ""  # callers need not distinguish "missing" from "expired"
    fd = self.data[key][field]  # take the whole cell; expiry is needed for the check
    if not self._is_alive(fd, timestamp):  # expired: treat as absent even though it is still stored
        return ""
    return fd["value"]  # value only; expiry is never exposed
def delete_at(self, key, field, timestamp):  # delete a cell only if it is still alive
    if key not in self.data:
        return False
    if field not in self.data[key]:
        return False
    if not self._is_alive(self.data[key][field], timestamp):  # already expired?
        return False  # an expired field counts as non-existent
    del self.data[key][field]
    if not self.data[key]:  # row is empty after the delete
        del self.data[key]
    return True
def scan_at(self, key, timestamp):  # scan a row, reporting only live fields
    if key not in self.data:
        return ""
    fields = []  # live field names only
    for f in self.data[key]:
        if self._is_alive(self.data[key][f], timestamp):
            fields.append(f)
    fields.sort()  # alphabetical order
    if not fields:  # no live field left
        return ""
    parts = []
    for f in fields:
        parts.append(f"{f}({self.data[key][f]['value']})")
    return ", ".join(parts)
def scan_by_prefix_at(self, key, prefix, timestamp):  # prefix must match AND the field must be alive
    if key not in self.data:
        return ""
    fields = []
    for f in self.data[key]:
        if f.startswith(prefix):  # check the name first
            if self._is_alive(self.data[key][f], timestamp):  # then check liveness
                fields.append(f)
    fields.sort()
    if not fields:
        return ""
    parts = []
    for f in fields:
        parts.append(f"{f}({self.data[key][f]['value']})")
    return ", ".join(parts)
set_at("u1", "name", "alice", 10)
set_at_with_ttl("u1", "session", "s1", 10, 50)
set_at_with_ttl("u1", "cache", "c1", 10, 20)
         | name                | session             | cache
---------+---------------------+---------------------+---------------------
u1       | {"value": "alice",  | {"value": "s1",     | {"value": "c1",
         |  "expiry": None}    |  "expiry": 60}      |  "expiry": 30}
         | lives forever       | 10 + 50 = 60        | 10 + 20 = 30
time:     10 --- 25 --- 30 --- 35 --- 60 --- 999
name:     ✅     ✅     ✅     ✅     ✅     ✅
session:  ✅     ✅     ✅     ✅     ❌     ❌
cache:    ✅     ✅     ❌     ❌     ❌     ❌
                        ↑ 30 < 30? No → dead
get_at("u1","session",59) → 59<60 → alive → "s1"
get_at("u1","session",60) → 60<60 → NO → dead → ""
get_at("u1","name",999) → expiry=None → alive → "alice"
# scan at timestamp 25:
cache:25<30 ✅ name:None ✅ session:25<60 ✅
→ "cache(c1), name(alice), session(s1)"
# scan at timestamp 35:
cache:35<30? ❌ DEAD name:✅ session:35<60 ✅
→ "name(alice), session(s1)"
# delete cache at timestamp 35:
cache: 35<30? NO → DEAD → return False
# a field that is already dead cannot be deleted
backup stores remaining_ttl (not expiry!); restore recomputes expiry from remaining_ttl
# key = row, field = column, fd = cell, timestamp = current time
def backup(self, timestamp):  # snapshot the live fields with their remaining TTL (not a plain deepcopy)
    snapshot = {}
    count = 0  # rows that have at least one live field
    for key, fields in self.data.items():
        alive_fields = {}  # live cells in this row
        for field, fd in fields.items():
            if self._is_alive(fd, timestamp):
                remaining = None  # default: no TTL
                if fd["expiry"] is not None:
                    remaining = fd["expiry"] - timestamp  # time left to live
                alive_fields[field] = {
                    "value": fd["value"],
                    "remaining_ttl": remaining
                }
        if alive_fields:  # skip rows with no live field
            snapshot[key] = alive_fields
            count += 1
    self.backups.append((timestamp, snapshot))
    return count  # number of rows captured
# key = row, field = column, fd = cell, timestamp = current time
def restore(self, timestamp, backup_timestamp):  # restore from a snapshot
    best = None  # latest backup at or before backup_timestamp
    for ts, snap in self.backups:
        if ts <= backup_timestamp:
            if best is None or ts > best[0]:
                best = (ts, snap)
    if best is None:  # no usable backup
        return ""
    _, snapshot = best
    self.data = {}  # wipe the whole database
    count = 0
    for key, fields in snapshot.items():
        self.data[key] = {}
        for field, fd in fields.items():
            expiry = None  # default: no TTL
            if fd["remaining_ttl"] is not None:
                expiry = timestamp + fd["remaining_ttl"]  # recompute the expiry from the restore time
            self.data[key][field] = {
                "value": fd["value"],
                "expiry": expiry
            }
        if self.data[key]:
            count += 1
    return str(count)  # returns a string: "1", not 1
# data when the backup runs at ts=40:
         | name                | session             | cache
---------+---------------------+---------------------+---------------------
u1       | {"value": "alice",  | {"value": "s1",     | {"value": "c1",
         |  "expiry": None}    |  "expiry": 60}      |  "expiry": 30}
         | lives forever       | 60 - 40 = 20 left   | 40 < 30? No → already dead, not captured
# snapshot that gets stored:
         | name                       | session
---------+----------------------------+----------------------------
u1       | {"value": "alice",         | {"value": "s1",
         |  "remaining_ttl": None}    |  "remaining_ttl": 20}
         | no TTL → None              | expiry(60) - ts(40) = 20
# cache is dead, so it is not in the snapshot
# return 1 (only 1 row has live fields)
# restore at ts=200: find a backup: ts=40 <= 40 ✅ → use it
# wipe the whole database
# rebuild field by field, remaining_ttl → new expiry
         | name                | session
---------+---------------------+---------------------
u1       | {"value": "alice",  | {"value": "s1",
         |  "expiry": None}    |  "expiry": 220}
         | no TTL              | ts(200) + remaining(20) = 220
# session gets another 20 time units to live
# return "1" (a string, not an int)
# no backup found → return ""
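The arithmetic in the trace above comes down to two small conversions. The helper names `to_snapshot` and `from_snapshot` are hypothetical and exist only for this sketch; the notes perform the same steps inline inside `backup` and `restore`.

```python
def to_snapshot(fd, backup_ts):
    """Live cell {value, expiry} → snapshot cell {value, remaining_ttl}."""
    remaining = None if fd["expiry"] is None else fd["expiry"] - backup_ts
    return {"value": fd["value"], "remaining_ttl": remaining}

def from_snapshot(cell, restore_ts):
    """Snapshot cell {value, remaining_ttl} → live cell {value, expiry}."""
    expiry = None if cell["remaining_ttl"] is None else restore_ts + cell["remaining_ttl"]
    return {"value": cell["value"], "expiry": expiry}

session = {"value": "s1", "expiry": 60}
saved = to_snapshot(session, backup_ts=40)       # remaining_ttl = 60 - 40 = 20
restored = from_snapshot(saved, restore_ts=200)  # expiry = 200 + 20 = 220
print(saved, restored)
```

Storing the remaining lifespan rather than the absolute expiry is what lets a restored field keep the same time to live no matter when the restore happens.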
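# with TTL → use the template above (remaining_ttl version)
# without TTL → use the simpler deepcopy version:
import copy
def backup(self, timestamp):
    state = copy.deepcopy(self.data)  # independent copy, so later writes cannot change the backup
    self.backups.append((timestamp, state))
    return len(self.data)
def restore(self, timestamp, backup_timestamp):
    best = None  # latest backup at or before backup_timestamp
    for ts, state in self.backups:
        if ts <= backup_timestamp:
            if best is None or ts > best[0]:
                best = (ts, state)
    if best is None:
        return False
    self.data = copy.deepcopy(best[1])  # copy again so the stored backup stays reusable
    return True
# both can be copied as-is; just rename the variables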
Exactly the same pattern as Bank L5: one lock per key. set returns None.
async def batch_operations(self, operations):  # batch of ops (lock per key + gather)
    async def execute_op(op):  # handle a single op
        key = op.get("key", "")  # which row
        lock = self.key_locks[key]  # that row's lock
        async with lock:  # serialize ops on the same key
            if op["type"] == "set":
                self.set(op["key"], op["field"], op["value"])  # reuse the L1 set
                return None  # set has no return value
            elif op["type"] == "get":
                return self.get(op["key"], op["field"])  # reuse the L1 get
            elif op["type"] == "delete":
                return self.delete(op["key"], op["field"])  # reuse the L1 delete
            elif op["type"] == "scan":
                return self.scan(op["key"])  # reuse the L2 scan
            return None  # unknown type
    tasks = []  # collect every coroutine
    for op in operations:
        tasks.append(execute_op(op))
    results = await asyncio.gather(*tasks)
    return list(results)  # gather results as a plain list, in input order
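Receives many operations at once (set, get, delete, scan)
and has to run them all concurrently
Why a lock:
Two operations changing the same row (same key) at the same time can interleave
e.g. two callers set user1's name at once and it is unclear which one wins
So each key has its own lock: hold it while changing, everyone else waits
Recipe (the same in every mock):
1. Receive a list of operations
2. For each operation, take its key and lock that row
3. Dispatch on type to the L1/L2 methods you already wrote
4. asyncio.gather collects all the results
5. Return the list of results
L5 only wraps a layer of lock + gather around the outside
The inside calls your existing methods directly
ops = [
    {"type":"set","key":"u1","field":"name","value":"alice"},
    {"type":"get","key":"u1","field":"name"},
    {"type":"delete","key":"u2","field":"x"},
]
→ [None, "alice", False]
1. Lock per KEY (not per field)
2. Call the sync methods inside async with lock:
3. asyncio.gather(*[...]) runs them concurrently
4. set returns None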
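The four points above can be checked end to end with a stripped-down store. `MiniDB` is a hypothetical stand-in for this sketch: it stores bare values (no `{value, expiry}` cells) and omits `scan`, so that only the lock-per-key and gather mechanics remain.

```python
import asyncio
from collections import defaultdict

class MiniDB:
    def __init__(self):
        self.data = {}
        self.key_locks = defaultdict(asyncio.Lock)  # one lock per key, created on first use

    def set(self, key, field, value):
        self.data.setdefault(key, {})[field] = value

    def get(self, key, field):
        return self.data.get(key, {}).get(field, "")

    def delete(self, key, field):
        if field not in self.data.get(key, {}):
            return False
        del self.data[key][field]
        if not self.data[key]:
            del self.data[key]
        return True

    async def batch_operations(self, operations):
        async def execute_op(op):
            async with self.key_locks[op["key"]]:  # ops on the same key run one at a time
                if op["type"] == "set":
                    self.set(op["key"], op["field"], op["value"])
                    return None
                if op["type"] == "get":
                    return self.get(op["key"], op["field"])
                if op["type"] == "delete":
                    return self.delete(op["key"], op["field"])
                return None  # unknown type
        results = await asyncio.gather(*(execute_op(op) for op in operations))
        return list(results)  # same order as the input

ops = [
    {"type": "set", "key": "u1", "field": "name", "value": "alice"},
    {"type": "get", "key": "u1", "field": "name"},
    {"type": "delete", "key": "u2", "field": "x"},
]
print(asyncio.run(MiniDB().batch_operations(ops)))
```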
All-sleep: every key sleeps, with no fail-fast. This is different from Bank L6!
async def batch_scan(self, keys, max_concurrent):  # batch scan (semaphore)
    sem = asyncio.Semaphore(max_concurrent)  # limits how many run at once
    async def scan_key(key):  # scan every field of one key
        async with sem:  # every key takes the semaphore
            await asyncio.sleep(0.01)  # every key sleeps
            return (key, self.scan(key))  # (key, scan result) tuple
    tasks = []  # collect every coroutine
    for k in keys:
        tasks.append(scan_key(k))
    results = await asyncio.gather(*tasks)
    return dict(results)  # returns a dict on purpose: key → scan result
# ── if the spec says "skip missing" → switch to fail-fast ──
async def batch_scan_fail_fast(self, keys, max_concurrent):  # batch scan (fail-fast pattern)
    sem = asyncio.Semaphore(max_concurrent)  # limits how many run at once
    async def scan_key(key):
        if key not in self.data:  # check first
            return (key, "")  # leave immediately, no sleep
        async with sem:  # only existing keys enter the semaphore
            await asyncio.sleep(0.01)  # only existing keys sleep
            return (key, self.scan(key))  # (key, scan result) tuple
    tasks = []
    for k in keys:
        tasks.append(scan_key(k))
    results = await asyncio.gather(*tasks)
    return dict(results)  # same dict shape as the all-sleep version
Bank L6 (fail-fast):
the check comes before the semaphore
failure → return False, no sleep
InMemDB L6 (all-sleep):
no check; everything sleeps
key does not exist? scan returns "" → it still sleeps
keys = ["u1", "u2", "nokey"]
max_concurrent = 2
# round 1 (2 at once): u1 + u2 → sleep
# round 2: nokey → sleep
→ {"u1":"name(alice)","u2":"name(bob)","nokey":""}
# returns a dict on purpose: the caller sees key → result directly
keys = ["u1", "nokey", "nokey2", "u2"], max_concurrent=1
all-sleep (batch_scan):
u1 → sleep 0.01 → "name(alice)"
nokey → sleep 0.01 → ""
nokey2 → sleep 0.01 → ""
u2 → sleep 0.01 → "name(bob)"
total: about 0.04 s (all 4 sleep)
fail-fast (batch_scan_fail_fast):
u1 → sleep 0.01 → "name(alice)"
nokey → returns "" immediately → 0 s
nokey2 → returns "" immediately → 0 s
u2 → sleep 0.01 → "name(bob)"
total: about 0.02 s (only 2 sleep)
The exam's timing test checks for this difference.
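The difference can be observed without relying on wall-clock time by counting how many requests actually sleep. This is a sketch under two assumptions: `data` maps each key directly to its already-formatted scan string, and a single function with a hypothetical `fail_fast` flag stands in for the two separate methods.

```python
import asyncio

async def batch_scan(data, keys, max_concurrent, fail_fast):
    """Return (results dict, number of requests that slept)."""
    sem = asyncio.Semaphore(max_concurrent)
    sleeps = 0

    async def scan_key(key):
        nonlocal sleeps
        if fail_fast and key not in data:
            return (key, "")  # fail-fast: skip the semaphore and the sleep
        async with sem:
            sleeps += 1
            await asyncio.sleep(0.01)  # simulated external call
            return (key, data.get(key, ""))

    results = await asyncio.gather(*(scan_key(k) for k in keys))
    return dict(results), sleeps

data = {"u1": "name(alice)", "u2": "name(bob)"}
keys = ["u1", "nokey", "nokey2", "u2"]
print(asyncio.run(batch_scan(data, keys, 1, fail_fast=False)))  # 4 sleeps
print(asyncio.run(batch_scan(data, keys, 1, fail_fast=True)))   # 2 sleeps
```

Both variants return the same dict; only the number of sleeps, and therefore the elapsed time, differs.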
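Picture a circle (ring) with positions 0 to 99. Nodes sit on it, and each key routes to the nearest node clockwise.
# hash function (mandated by the spec)
import hashlib
def _hash(s):  # string → position 0-99
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % 100
# any string → a position 0-99
# _hash("node_a") → e.g. 20
# _hash("key1") → e.g. 10
# hash ring = a circular street with 100 slots (0-99)
# node = a shop (fixed at some slot)
# key = a customer (stands at some slot and looks for the nearest shop)
# _hash() = an address-assigning machine (black box)
# give it a name → it returns a number 0-99 → that is the address
# you cannot control which number comes out, but the same name always gives the same number
# why hash? because you do not want to pick addresses yourself
# if you picked them, all the shops might cluster together
# hashing spreads them pseudo-randomly around the street
# the exam gives you this function; use it as-is
# illustrative positions (assumed for this walkthrough, not real md5 output):
_hash("node_a") → 20  # shop node_a opens at slot 20
_hash("node_b") → 60  # shop node_b opens at slot 60
_hash("key1") → 10  # key1 stands at slot 10
_hash("key2") → 30  # key2 stands at slot 30
_hash("key3") → 80  # key3 stands at slot 80
# what the street looks like (most slots are empty):
slot 0 → empty
...
slot 10 → key1 (customer)
...
slot 20 → node_a (shop)
...
slot 30 → key2 (customer)
...
slot 60 → node_b (shop)
...
slot 80 → key3 (customer)
...
slot 99 → empty → wraps back to slot 0
# each customer walks clockwise; the first shop reached is theirs:
# key1(10) → walk → node_a at 20 → enter ✅
# key2(30) → walk → node_b at 60 → enter ✅
# key3(80) → walk → 99...0...node_a at 20 → enter ✅ (wrapped around)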
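The clockwise walk is a "first position >= pos, else wrap" lookup, which `bisect` from the standard library can do on a sorted ring. This sketch takes positions directly instead of hashing names, so the illustrative numbers from the walkthrough can be reused; the function name `route` is an assumption.

```python
import bisect

def route(ring, pos):
    """ring: sorted list of (position, node_id). Return the first node at or after pos, wrapping."""
    if not ring:
        return None
    positions = [p for p, _ in ring]
    i = bisect.bisect_left(positions, pos)  # first index whose position >= pos
    return ring[i % len(ring)][1]           # past the end → wrap to ring[0]

ring = [(20, "node_a"), (60, "node_b")]
print(route(ring, 10))  # node_a
print(route(ring, 30))  # node_b
print(route(ring, 80))  # node_a (wrapped around)
```

A linear scan over the sorted ring gives the same answers; `bisect` only matters when the ring has many positions.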
import asyncio
import hashlib
from collections import defaultdict
class HashRing:
    def __init__(self):
        self.nodes = {}  # L1: node_id → {replicas, positions}
        self.keys = {}  # L2: key → node_id (which node holds which key)
        self.key_access = {}  # L4: key → timestamp (for LRU)
        self.capacities = {}  # L4: node_id → max capacity
        self.key_locks = defaultdict(asyncio.Lock)  # L5
self.nodes = {  # shop registry (key = shop name)
    "node_a": {
        "replicas": 1,  # node_a has 1 branch
        "positions": [20],  # at slot 20 on the street
    },
    "node_b": {
        "replicas": 3,  # node_b has 3 branches
        "positions": [60, 35, 88],  # at slots 60, 35 and 88
    },
}
self.keys = {  # where each customer lives (key = customer, value = shop)
    "key1": "node_a",
    "key2": "node_b",
    "key3": "node_a",
}
self.key_access = {  # last time each customer was seen (smallest = oldest = evicted first)
    "key1": 10,  # key1 was last seen at time=10
    "key2": 20,  # key2 was last seen at time=20
}
self.capacities = {  # max customers per shop (used from L4)
    "node_a": 5,
    "node_b": 3,
}  # a shop with no entry → .get(node, -1) → -1 = unlimited
# Helper 1: _ring: build the map of the street (a list sorted by position)
def _ring(self):
    items = []  # will hold (position, node_id) tuples
    for node_id, info in self.nodes.items():
        for pos in info["positions"]:  # a node may have N positions (L3 replicas)
            items.append((pos, node_id))
    items.sort()  # ascending by position (tuples compare on element 0 first)
    return items
# Helper 2: _route: from the key's position, walk clockwise to the first node
def _route(self, key):
    ring = self._ring()  # latest map of the street
    if not ring:  # no node on the ring
        return None
    pos = _hash(key)  # where the key stands
    for ring_pos, node_id in ring:  # scan from the smallest position
        if ring_pos >= pos:  # first node at or after the key's position
            return node_id
    return ring[0][1]  # nothing found → wrap to the first node on the ring
# Helper 3: _reassign: the ring changed, so every key is routed again
def _reassign(self):
    new_keys = {}  # build a new dict (avoids modifying while iterating)
    for key in list(self.keys):
        node = self._route(key)  # route on the latest ring
        if node is not None:  # the ring still has nodes
            new_keys[key] = node
    self.keys = new_keys  # replace the old mapping
_ring()
returns [(pos, node_id), ...] sorted by pos
every node's N positions are flattened into it
_route(key)
computes the position with _hash(key)
walks clockwise to the first node with ring_pos >= pos
wraps to ring[0] if none is found
_reassign()
called after add_node / remove_node
every existing key is routed again on the latest ring
node = shop, key = customer, ring = circular street, position = address, route = find the nearest shop
def add_node(self, timestamp, node_id):  # open one more shop on the street
    if node_id in self.nodes:  # a node with this name already exists
        return False  # duplicate → reject
    self.nodes[node_id] = {
        "replicas": 1,  # L1: one node = one position (replicas arrive in L3)
        "positions": [_hash(node_id)]  # hash the node name → its address on the ring
    }
    self._reassign()  # the ring changed → route every key again
    return True
def remove_node(self, timestamp, node_id):  # close a shop; its keys move clockwise to the next one
    if node_id not in self.nodes:  # unknown node → reject
        return False
    del self.nodes[node_id]
    self._reassign()  # the ring changed → route every key again
    return True
def route_key(self, timestamp, key):  # which node would this key go to (nothing is recorded)
    return self._route(key)  # node name or None
def get_node_count(self, timestamp):  # how many nodes exist
    return len(self.nodes)
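def __init__(self):
    self.nodes = {}  # shop registry: node_id → {replica count, position list}
    self.keys = {}  # where each customer lives: key → node_id (added in L2)
    self.key_access = {}  # last access: key → timestamp (for L4 eviction)
    self.capacities = {}  # node_id → max keys (added in L4)
    self.key_locks = defaultdict(asyncio.Lock)  # used in L5 (batch_operations locks per key)
self.nodes = {  # shop registry (key = shop name)
    "node_a": {
        "replicas": 1,  # node_a has 1 branch
        "positions": [20],  # at slot 20 on the street
    },
    "node_b": {
        "replicas": 1,
        "positions": [60],
    },
}
self.keys = {}  # L1: no customers yet (added in L2)
self.key_access = {}  # L1: empty (added in L4)
self.capacities = {}  # L1: empty (added in L4)
_hash(s)     string → position 0-99
_ring()      builds [(pos, node), ...] sorted by pos
_route(key)  walks clockwise to the first node at or after the key's position
_reassign()  after the ring changes, routes every key again
node = shop, key = customer, store = the customer moves in, load = number of customers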
def store_key(self, timestamp, key):  # route the key and record where it lives
    node = self._route(key)  # nearest node clockwise
    if node is None:  # no node on the ring
        return None  # nothing is recorded
    self.keys[key] = node  # record which node holds the key
    self.key_access[key] = timestamp  # record the access time (used by L4 LRU)
    return node
def list_nodes(self, timestamp):  # all node names, alphabetical
    return sorted(self.nodes)  # iterating a dict yields its keys
def get_load(self, timestamp, node_id):  # how many keys live on this node
    if node_id not in self.nodes:  # unknown node
        return 0
    count = 0
    for n in self.keys.values():  # values are the node each key lives on
        if n == node_id:
            count += 1
    return count
def top_loaded(self, timestamp, n):  # top n nodes by key count
    loads = {}  # node_id → key count
    for nid in self.nodes:  # start every node at 0 so empty nodes are not dropped
        loads[nid] = 0
    for node_id in self.keys.values():
        if node_id in loads:  # guard against stale routing
            loads[node_id] += 1
    items = sorted(loads.items(), key=lambda x: (-x[1], x[0]))  # most loaded first; ties by name ascending
    result = []
    for nid, load in items[:n]:
        result.append(f"{nid}({load})")  # format "node_a(2)"
    return result
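The sort key `(-count, name)` is the part that is easiest to get wrong, so here it is in isolation. This sketch assumes a free function taking `nodes` and `keys` as plain arguments instead of reading them from `self`.

```python
def top_loaded(nodes, keys, n):
    """Rank nodes by key count descending, then by name ascending; format as 'name(count)'."""
    loads = {nid: 0 for nid in nodes}  # zero-load nodes must still appear
    for node_id in keys.values():
        if node_id in loads:
            loads[node_id] += 1
    ranked = sorted(loads.items(), key=lambda item: (-item[1], item[0]))
    return [f"{nid}({load})" for nid, load in ranked[:n]]

nodes = ["node_b", "node_a", "node_c"]
keys = {"k1": "node_a", "k2": "node_b", "k3": "node_a", "k4": "node_b"}
print(top_loaded(nodes, keys, 3))  # ['node_a(2)', 'node_b(2)', 'node_c(0)']
```

Negating the count gives a descending order on load while keeping the name comparison ascending, all in one `sorted` call.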
def __init__(self):
    self.nodes = {}  # shop registry: node_id → {replica count, position list}
    self.keys = {}  # where each customer lives: key → node_id (added in L2)
    self.key_access = {}  # last access: key → timestamp (for L4 eviction)
    self.capacities = {}  # node_id → max keys (added in L4)
    self.key_locks = defaultdict(asyncio.Lock)  # used in L5
self.nodes = {  # shop registry
    "node_a": {
        "replicas": 1,
        "positions": [20],
    },
    "node_b": {
        "replicas": 1,
        "positions": [60],
    },
}
self.keys = {  # where each customer lives (new in L2)
    "key1": "node_a",
    "key2": "node_b",
    "key3": "node_a",
}
self.key_access = {  # last access time (recorded from L2, used for L4 eviction)
    "key1": 5,
    "key2": 6,
    "key3": 7,
}
self.capacities = {}  # L2: empty (added in L4)
_route(key)  store_key uses it to find the node
# all other sorting / counting is written inline
node = shop, replica = branch: one shop can open several branches at different positions
def add_node_with_replicas(self, timestamp, node_id, num_replicas):  # add a node with N positions
    if node_id in self.nodes:  # a node with this name already exists
        return False
    positions = []  # the N replica positions
    for i in range(num_replicas):  # replica i
        positions.append(_hash(f"{node_id}_{i}"))  # hash "node_id_i" → replica position
    self.nodes[node_id] = {
        "replicas": num_replicas,  # remembered for get_replica_count
        "positions": positions
    }
    self._reassign()  # the ring gained N positions → route every key again
    return True
def get_replica_count(self, timestamp, node_id):  # how many replicas this node has
    if node_id not in self.nodes:  # unknown node
        return 0
    return self.nodes[node_id]["replicas"]
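The replica positions can be computed in isolation to see what the list looks like. The helper name `replica_positions` is hypothetical; `_hash` is the function given earlier in these notes. No expected values are shown because they depend on the real md5 output.

```python
import hashlib

def _hash(s):
    """String → position 0-99 (the hash function given by the spec)."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % 100

def replica_positions(node_id, num_replicas):
    """One ring position per replica, derived from the string 'node_id_i'."""
    return [_hash(f"{node_id}_{i}") for i in range(num_replicas)]

positions = replica_positions("node_c", 3)
print(positions)  # three deterministic values in 0-99
```

With only 100 slots, two replicas can land on the same position. The notes do not say how the spec treats such collisions; in the `_ring` helper shown earlier both tuples are kept and sorted by node name.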
def __init__(self):
    self.nodes = {}
    self.keys = {}
    self.key_access = {}
    self.capacities = {}  # added in L4
    self.key_locks = defaultdict(asyncio.Lock)  # used in L5
self.nodes = {  # shop registry
    "node_a": {
        "replicas": 1,  # 1 branch
        "positions": [20],
    },
    "node_c": {
        "replicas": 3,  # 3 branches
        "positions": [15, 42, 78],  # at slots 15, 42 and 78
    },
}
self.keys = {  # same as L2
    "key1": "node_a",
    "key2": "node_c",  # may have been reassigned to node_c (it has more positions now)
}
self.key_access = {  # same as L2
    "key1": 5,
    "key2": 6,
}
self.capacities = {}  # L3: empty (added in L4)
_hash(s)     every replica position comes from this
_reassign()  rebalances after the N positions are added
node = shop, capacity = max customers per shop, evict LRU = evict the customer unseen for the longest
Each node has a memory capacity — the maximum number of keys it can store. // 每個 node 有 capacity(最多幾多 key)
When a node is full and a new key needs to be stored, the least recently used (LRU) key is evicted. // 滿咗就踢走最舊 access 嗰條
set_capacity(timestamp, node_id, capacity) sets the max keys a node can hold. // 設 capacity
Returns False if node does not exist. Default capacity is unlimited. // 唔存在 → False,預設無限
get_capacity(timestamp, node_id) returns the capacity. // 攞 capacity
Returns 0 if node does not exist. Returns -1 for unlimited. // 唔存在 → 0,無限 → -1
get_used(timestamp, node_id) returns how many slots are currently used. // 用咗幾多
Returns 0 if node does not exist. // 唔存在 → 0
evict_lru(timestamp, node_id) manually evicts the least recently accessed key. // 手動踢走最舊嘅 key
Returns None if no keys to evict. Otherwise returns the evicted key name. // 冇 key → None,有 → return key 名
store_key behavior changes: if the target node is full and the key is not already on that node, // store 時如果滿咗
evict the LRU key first, then store the new key. // 先踢走最舊嘅,再放新嘅
Re-storing the same key refreshes its last_access timestamp. // 同一條 key 再 store → refresh 時間
def set_capacity(self, timestamp, node_id, capacity):  # set the max number of keys for a node
    if node_id not in self.nodes:  # unknown node → reject
        return False
    self.capacities[node_id] = capacity  # capacity is a key count
    return True
def get_capacity(self, timestamp, node_id):
    if node_id not in self.nodes:
        return 0  # spec: missing node → 0
    return self.capacities.get(node_id, -1)  # never set → -1 means unlimited
def evict_lru(self, timestamp, node_id):  # evict the least recently accessed key on this node
    if node_id not in self.nodes:
        return None
    # step 1: collect the keys stored on this node (value == node_id in self.keys)
    candidates = []
    for k, n in self.keys.items():
        if n == node_id:
            candidates.append(k)
    if not candidates:  # nothing to evict
        return None
    # step 2: among the candidates, find the smallest key_access timestamp
    lru = None
    lru_time = None
    for k in candidates:
        access_time = self.key_access[k]
        if lru is None or access_time < lru_time:  # older than the current pick?
            lru = k
            lru_time = access_time
    del self.keys[lru]  # remove the routing entry
    if lru in self.key_access:
        del self.key_access[lru]  # and its access record
    return lru  # tell the caller which key was evicted
def store_key(self, timestamp, key):  # L4 version (replaces the L2 one): evict LRU when full
    node = self._route(key)
    if node is None:  # no node on the ring
        return None
    capacity = self.capacities.get(node, -1)  # never set → -1 (unlimited)
    already_here = (self.keys.get(key) == node)  # is this key already stored on that node?
    used = 0
    for n in self.keys.values():  # count the keys on the target node
        if n == node:
            used += 1
    if capacity != -1 and not already_here and used >= capacity:  # all three must hold to evict
        self.evict_lru(timestamp, node)  # full → evict the oldest key first
    self.keys[key] = node
    self.key_access[key] = timestamp  # re-storing the same key also refreshes its access time
    return node
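The spec above lists `get_used`, but no implementation of it appears in these notes. A minimal sketch follows, written as free functions over the same `nodes`, `keys` and `key_access` dicts; `pick_lru` is a hypothetical name showing the LRU selection from `evict_lru` as a one-line `min`.

```python
def get_used(nodes, keys, node_id):
    """Number of keys currently stored on node_id; 0 if the node does not exist."""
    if node_id not in nodes:
        return 0
    return sum(1 for n in keys.values() if n == node_id)

def pick_lru(keys, key_access, node_id):
    """Key on node_id with the smallest access timestamp, or None if the node holds no keys."""
    candidates = [k for k, n in keys.items() if n == node_id]
    if not candidates:
        return None
    return min(candidates, key=lambda k: key_access[k])

nodes = {"node_a": {}, "node_b": {}}
keys = {"k1": "node_a", "k2": "node_a", "k3": "node_b"}
key_access = {"k1": 2, "k2": 6, "k3": 8}
print(get_used(nodes, keys, "node_a"))          # 2
print(pick_lru(keys, key_access, "node_a"))     # k1
```

In the class version, `store_key` could call `get_used` instead of repeating the counting loop.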
def __init__(self):
    self.nodes = {}
    self.keys = {}
    self.key_access = {}
    self.capacities = {}  # added in L4: node_id → max keys
    self.key_locks = defaultdict(asyncio.Lock)  # used in L5
self.nodes = {  # shop registry
    "node_a": {"replicas": 1, "positions": [20]},
    "node_b": {"replicas": 1, "positions": [60]},
}
self.keys = {  # where each customer lives
    "k1": "node_a",
    "k2": "node_a",  # node_a holds 2 keys
    "k3": "node_b",
}
self.key_access = {  # last access time
    "k1": 2,  # k1 is the oldest → evicted first
    "k2": 6,
    "k3": 8,
}
self.capacities = {  # max keys per node (new in L4)
    "node_a": 2,  # max 2 and it holds 2 → full
    "node_b": 5,
}  # no entry → .get(node, -1) → unlimited
store_key (written in L2) must change:
add the capacity check + evict LRU when full
re-storing the same key must refresh its access time
remove_node (written in L1) must change:
also drop the node's capacity with self.capacities.pop(node_id, None) (a plain del raises KeyError when no capacity was ever set)
_route(key)  store_key uses it to find the node
# store_key calls evict_lru directly; no extra private helper is factored out
node = shop, key = customer: handle operations for many customers at once
batch_operations(timestamp, operations) processes a list of operations concurrently. // 同時處理多個操作
Each operation is a dict with a "type" key: // 每個 op 有 type
{"type": "store", "key": "..."} — store the key, return the node_id or None // store → return node 或 None
{"type": "route", "key": "..."} — route the key, return the node_id or None // route → return node 或 None
{"type": "remove_key", "key": "..."} — remove the key from storage, return True/False // 刪 key → True/False
Operations on the same key must be serialized using a lock per key. // 同一條 key 要 lock
Use asyncio.gather for concurrency. Return results in input order. // gather 同時跑,順序對應 input
async def batch_operations(self, timestamp, operations):  # run many ops concurrently, lock per key
    async def execute(op):  # handle one op
        key = op["key"]
        async with self.key_locks[key]:  # ops on the same key run one at a time
            if op["type"] == "store":
                return self.store_key(timestamp, key)  # reuse store_key
            elif op["type"] == "route":
                return self._route(key)  # route records nothing
            elif op["type"] == "remove_key":  # no standalone remove_key method, so it is inline
                if key in self.keys:
                    del self.keys[key]  # drop the routing entry
                    if key in self.key_access:
                        del self.key_access[key]  # drop the access record
                    return True
                return False  # key was not stored
            return None  # unknown type
    tasks = []  # collect every coroutine
    for op in operations:
        tasks.append(execute(op))
    results = await asyncio.gather(*tasks)  # all ops run together (subject to the locks)
    return list(results)  # plain list, in input order
def __init__(self):
    self.nodes = {}
    self.keys = {}
    self.key_access = {}
    self.capacities = {}
    self.key_locks = defaultdict(asyncio.Lock)  # added in L5
No new user-data field
Only self.key_locks is new (for concurrency)
No new helper
Calls store_key / _route directly; remove is written inline
node = shop, sync = copy a shop's customer data to another shop, fail-fast = do nothing if a shop does not exist
sync_replicas(timestamp, requests, max_concurrent) simulates syncing data to external replicas. // 模擬 sync 去外部
Each request is a dict: {"source_node": "...", "dest_node": "..."}. // 每個 request 有 source 同 dest
Use a semaphore to limit concurrent syncs to max_concurrent. // sem 限制同時幾多個
Simulate each sync with await asyncio.sleep(0.01). // sleep 模擬 API call
If either the source or destination node does not exist, return False immediately // 任一 node 唔存在 → False
without acquiring the semaphore and without sleeping. // 唔入 sem,唔 sleep(fail-fast)
If both nodes exist, acquire the semaphore, sleep, and return True. // 兩個都存在 → sem + sleep + True
Return a list of booleans in the same order as the input. // return list,順序對應 input
# Fail-fast pattern: check first; invalid requests never enter the semaphore and never sleep
async def sync_replicas(self, timestamp, requests, max_concurrent):  # simulate syncing between nodes
    sem = asyncio.Semaphore(max_concurrent)  # at most max_concurrent syncs at once
    async def do_sync(req):  # handle one sync request
        source = req["source_node"]
        dest = req["dest_node"]
        if source not in self.nodes:  # check 1: does the source node exist?
            return False  # no → leave at once (no sleep, no semaphore)
        if dest not in self.nodes:  # check 2: does the destination node exist?
            return False
        async with sem:  # both checks passed → take a semaphore slot
            await asyncio.sleep(0.01)  # simulated sync time
            return True
    tasks = []  # collect every coroutine
    for req in requests:
        tasks.append(do_sync(req))
    results = await asyncio.gather(*tasks)  # all requests run together
    return list(results)  # same order as the input
def __init__(self):
    self.nodes = {}
    self.keys = {}
    self.key_access = {}
    self.capacities = {}
    self.key_locks = defaultdict(asyncio.Lock)
No new field
Only one more async method, which takes the external request list
No new helper
The fail-fast checks are written inline; nothing is factored out
── Helpers ──
🟰 _track(): same as Hashring _ring()
🟰 _route(): same as Hashring _route()
🟰 _reassign(): same as Hashring _reassign()
⚠️ _total_size(): no Hashring equivalent (Hashring counts heads, ChatRoute counts MB)
── L1 Server Management ──
🟰 add_server: same as Hashring add_node
🟰 remove_server: same as Hashring remove_node
🟰 route_request: same as Hashring route_key
🟰 get_server_count: same as Hashring get_node_count
── L2 Request Tracking ──
⚠️ assign_request: extra size_mb param (Hashring store_key has no size)
🟰 list_servers: same as Hashring list_nodes
⚠️ get_server_load: Hashring counts heads, ChatRoute sums MB
⚠️ top_servers: ranks by MB, not by count (same sort key, different value)
── L3 Virtual Nodes ──
🟰 add_server_with_replicas: identical to Hashring
🟰 get_replica_count: identical to Hashring
── L4 Memory Eviction ──
⚠️ _evict_lru_from: clears 3 dicts (request_sizes is the extra one)
⚠️ assign_request: the L4 version adds the while-loop eviction
⚠️ set_memory_limit: counterpart of Hashring set_capacity (MB, not count)
🟰 get_memory_limit: same as Hashring get_capacity (-1/0 rule)
⚠️ get_memory_used: uses _total_size (MB), not get_used (count)
⚠️ evict_oldest_session: counterpart of Hashring evict_lru
⚠️ remove_server: the L4 version adds one line, del memory_limits
── L5 Batch ──
🟰 batch_requests: same pattern as Hashring L5 (lock per request_id)
── L6 Replication ──
⚠️ replicate_sessions: one extra check, source memory_used >= bandwidth_mb
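According to Evening-Warthog7048: "consistent hashing, chat request routing to virtual nodes and RAM/GPU memory based eviction". The biggest variance is in L4: every request has a size, so eviction is size-based rather than count-based.
The wording differs (the spec renames things), but the pattern is the same as Hashring:
Real question → Hashring mock
server → node
chat request → key
circular track → hash ring
compute_position → _hash
Real variance (code that needs careful changes):
L2 assign_request takes an extra size_mb param
L2 load = sum of sizes (not a count)
L4 capacity = MB (not a count)
L4 eviction = keep evicting until the new request fits (may evict several)
L6 one extra check: bandwidth_mb <= the source's memory_used
circular track = a circular street with 100 slots (0-99)
server = a shop (fixed at some slot)
request = a customer (stands at some slot and looks for the nearest shop)
compute_position() = an address-assignment machine (a black box)
You feed it a name → it returns a number 0-99 → that number is the address
You cannot control which number comes out, but the same name always gives the same number
The exam gives you this function; use it as is
How ChatRoute differs from Hashring:
Hashring: keys have no size → count heads
ChatRoute: every request has size_mb → count MB
Eviction: Hashring evicts one when the head count exceeds capacity
ChatRoute may evict several when the MB total exceeds the limit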
import hashlib  # for the md5 hash function
import asyncio  # async + Lock + Semaphore for L5/L6
from collections import defaultdict  # default dict values, saves a few if-not-in lines

def compute_position(s):  # the spec provides this function; copy it as is
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % 100  # any string → a position 0-99

class ChatRouter:
    def __init__(self):
        self.servers = {}  # server_name → {replicas, positions}: where each shop sits on the street
        self.requests = {}  # request_id → server_name: which shop each customer is in now
        self.request_access = {}  # request_id → timestamp of last access (for LRU eviction)
        self.request_sizes = {}  # request_id → size_mb (added in L2; the heart of the size-based variance)
        self.memory_limits = {}  # server_name → max_mb (added in L4; each shop's RAM limit)
        self.request_locks = defaultdict(asyncio.Lock)  # L5: ops on the same request id must not run at once
        self.server_locks = defaultdict(asyncio.Lock)  # L6: lock the source server during the checks (avoid races)
Shape of self.servers:
{
    "srv_a": {"replicas": 1, "positions": [42]},
    "srv_b": {"replicas": 3, "positions": [60, 35, 88]},
}
Shape of self.requests:
{"r1": "srv_a", "r2": "srv_b", "r3": "srv_a"}
Shape of self.request_sizes:
{"r1": 30, "r2": 50, "r3": 20} ← how many MB each request carries
Shape of self.request_access:
{"r1": 10, "r2": 20} ← timestamp of the last access
Shape of self.memory_limits:
{"srv_a": 100, "srv_b": 200} ← the most MB each server may hold
# Helper 1: _track builds the map of the circular street (a list sorted by position)
# 🟰 Same as Hashring _ring(); rename and reuse
def _track(self):
    items = []  # will hold (pos, name) tuples
    for name, info in self.servers.items():  # look at each shop
        for pos in info["positions"]:  # one shop may hold several positions (L3 virtual nodes)
            items.append((pos, name))  # every position is one marker on the street
    items.sort()  # ascending by position, so _route() can walk clockwise
    return items  # e.g. [(15,"a"),(42,"b"),(78,"a")]

# Helper 2: _route walks clockwise from the customer's position to the first shop
# 🟰 Same as Hashring _route(); rename and reuse
def _route(self, request_id):
    track = self._track()  # always read the latest state of the street
    if not track:  # no shop on the street at all
        return None  # nothing to route to
    pos = compute_position(request_id)  # spec rule: customers use the same hash for their position
    for track_pos, name in track:  # walk the markers clockwise
        if track_pos >= pos:  # the first marker at or after the customer is the answer
            return name
    return track[0][1]  # walked past the end → wrap around to the first shop

# Helper 3: _reassign re-routes every customer after the street changes
# 🟰 Same as Hashring _reassign(); rename and reuse
def _reassign(self):
    new_requests = {}  # build a new dict; safer than editing in place
    for req_id in list(self.requests):  # list() guards against mutation while iterating
        server = self._route(req_id)  # where should it go on the latest track?
        if server is not None:  # only assign while a shop still exists
            new_requests[req_id] = server
    self.requests = new_requests  # swap in one step; every request is re-routed
_track()
returns [(pos, server_name), ...] sorted by pos
all N positions of every server are flattened into it
_route(request_id)
computes the position with compute_position(request_id)
walks clockwise to the first shop with track_pos >= pos
wraps to track[0] if none is found
_reassign()
called after add_server / remove_server
re-routes every existing request on the latest street
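The clockwise walk in _route() is a linear scan over a list that is already sorted, so the same lookup can be expressed with the standard-library bisect module. This is a sketch under one assumption: the track is the [(pos, name), ...] list that _track() returns. The name route_with_bisect is hypothetical and not part of the spec.

```python
from bisect import bisect_left

def route_with_bisect(track, pos):
    """Return the first server at or after pos on a sorted track, wrapping to the start."""
    if not track:
        return None  # no server on the track
    positions = [p for p, _ in track]
    i = bisect_left(positions, pos)  # first index whose position is >= pos
    if i == len(track):  # pos is past every marker → wrap around
        i = 0
    return track[i][1]

track = [(15, "a"), (42, "b"), (78, "c")]
exact_hit = route_with_bisect(track, 42)   # a marker sits exactly at 42
wrapped = route_with_bisect(track, 79)     # past the last marker
```

bisect_left (not bisect_right) is the one that matches the "greater than or equal" rule, because it returns the index of an exact match instead of the slot after it.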
server = shop  request = customer  track = circular street  position = address  route = find the nearest shop
You are building a chat request routing system. Your system manages a set of servers arranged on a circular track with positions numbered 0 to 99. A hash function compute_position(name) is provided; it takes any string and returns a position from 0 to 99. // a circular street 0-99; the hash function is given
add_server(timestamp, server_name) registers a new server on the track at position compute_position(server_name). If the server already exists, return False. Otherwise return True. // open a shop; already exists → False
remove_server(timestamp, server_name) removes a server from the track. Any chat requests currently assigned to this server should be automatically reassigned to the next available server in clockwise order. Returns False if server doesn't exist. // close a shop; its customers move to the next one automatically
route_request(timestamp, request_id) determines which server should handle this chat request. Compute the request's position using compute_position(request_id), then find the first server whose position is greater than or equal to the request's position, moving clockwise. If the request's position exceeds all server positions, wrap around to the server with the smallest position. Return None if no servers exist. // walk clockwise to the nearest server
get_server_count(timestamp) returns the total number of servers. // how many servers
def add_server(self, timestamp, server_name):  # open a new shop on the street
    if server_name in self.servers:  # a shop with this name already exists
        return False  # spec: return False
    self.servers[server_name] = {  # register the shop
        "replicas": 1,  # in L1 a shop holds exactly one position
        "positions": [compute_position(server_name)]  # hash the server name to get its slot
    }
    self._reassign()  # the street gained a shop, so re-route every customer
    return True

# 🟰 Same as Hashring remove_node
def remove_server(self, timestamp, server_name):  # close a shop
    if server_name not in self.servers:  # no such shop
        return False  # spec: return False
    del self.servers[server_name]  # drop it from the servers dict
    self._reassign()  # its customers move clockwise to the next shop
    return True

# 🟰 Same as Hashring route_key
def route_request(self, timestamp, request_id):  # pure query: where should this request go?
    return self._route(request_id)  # the helper answers directly

# 🟰 Same as Hashring get_node_count
def get_server_count(self, timestamp):  # how many shops exist now
    return len(self.servers)  # the number of dict keys is the answer
def __init__(self):
self.servers = {} 舖頭名冊:邊間舖喺條街邊度
self.requests = {} 客人住邊間:客人名 → 舖頭名
self.request_sizes = {} 客人行李:客人名 → 幾多 MB
self.request_access = {} 客人最後出現:客人名 → timestamp(踢人用)
self.memory_limits = {} 舖頭行李上限:舖頭名 → 最多幾多 MB(L4 加)
self.request_locks = defaultdict(asyncio.Lock) L5 用
self.server_locks = defaultdict(asyncio.Lock) L6 用
self.servers = { 舖頭名冊
"srv_a": {
"replicas": 1,
"positions": [42],
},
"srv_b": {
"replicas": 1,
"positions": [78],
},
}
self.requests = {} 客人住邊間(L1 未 assign,空)
self.request_sizes = {} 客人行李(L2 先用)
self.request_access = {} 客人最後出現(L4 eviction 先用)
self.memory_limits = {} 舖頭行李上限(L4 先加)
self.request_locks = defaultdict(asyncio.Lock) L5 先用
self.server_locks = defaultdict(asyncio.Lock) L6 先用
compute_position(s)
spec 畀,任何字串 → 0-99
_track()
砌條街 [(pos, name), ...],sorted by pos
_route(request_id)
順時針搵第一間 ≥ 客人位置嘅舖
_reassign()
servers 變動之後,全部 request 重新派
Each chat request now carries a memory footprint (size_mb). // every request has a size
assign_request(timestamp, request_id, size_mb) routes and records the assignment WITH the size. // Hashring store_key has no size param!
If the same request is assigned again, update routing and refresh access time, BUT keep original size. // re-assigning does not change the size
Returns server name or None.
list_servers(timestamp) returns all server names sorted alphabetically. // same as Hashring
get_server_load(timestamp, server_name) returns the total megabytes of all requests on this server. // not a count! It is the sum of sizes
Returns 0 if server doesn't exist.
top_servers(timestamp, n) ranked by total memory desc, tie by name asc. Format: "server_name(total_mb)". // ranks by total MB, not by count
def _total_size(self, server_name):  # total MB held by one shop
    total = 0  # accumulator
    for req_id, srv in self.requests.items():  # walk every request
        if srv == server_name:  # it lives on the target server
            total = total + self.request_sizes[req_id]  # add its size_mb
    return total

# ── L2 methods ──
# ⚠️ Extra size_mb (Hashring store_key has no size)
def assign_request(self, timestamp, request_id, size_mb):  # route the request and record its size
    server = self._route(request_id)  # walk clockwise to a shop
    if server is None:  # no shop is open
        return None
    if request_id not in self.request_sizes:  # record the size only the first time
        self.request_sizes[request_id] = size_mb
    self.requests[request_id] = server  # record which shop it is in
    self.request_access[request_id] = timestamp  # refresh the access time
    return server

# 🟰 Same as Hashring list_nodes
def list_servers(self, timestamp):  # all shop names, ascending
    return sorted(self.servers.keys())

# ⚠️ Hashring get_load counts heads, ChatRoute sums MB
def get_server_load(self, timestamp, server_name):  # total MB on this shop
    if server_name not in self.servers:  # the shop does not exist
        return 0  # spec: return 0
    return self._total_size(server_name)  # the helper sums every size_mb

# ⚠️ Hashring top_loaded ranks by count, ChatRoute ranks by MB (same sort key, different value)
def top_servers(self, timestamp, n):  # top N shops by MB, descending
    loads = {}  # server_name → total_mb
    for server_name in self.servers:  # every shop must appear, even at 0 MB
        loads[server_name] = 0
    for req_id, server_name in self.requests.items():  # walk every request
        if server_name in loads:  # guard: the shop may have just been removed
            loads[server_name] = loads[server_name] + self.request_sizes[req_id]
    items = sorted(loads.items(), key=lambda x: (-x[1], x[0]))  # bigger MB first, ties by name asc
    result = []
    for server_name, load in items[:n]:  # keep the first N
        result.append(f"{server_name}({load})")  # format: "srv_a(80)"
    return result
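The sort key in top_servers, (-load, name), is the part that most often goes wrong, so here is a small worked example. It is a sketch only: top_n is a hypothetical free function, and the loads dict stands in for what the method computes from self.requests and self.request_sizes.

```python
def top_n(loads, n):
    """Rank servers by total MB descending, break ties by name ascending."""
    ranked = sorted(loads.items(), key=lambda item: (-item[1], item[0]))
    return [f"{name}({mb})" for name, mb in ranked[:n]]

loads = {"srv_b": 20, "srv_c": 80, "srv_a": 80, "srv_d": 0}
top_two = top_n(loads, 2)
top_all = top_n(loads, 10)  # asking for more than exist just returns everything
```

Negating the number flips only the numeric part of the key, so the name still sorts ascending inside a tie; using reverse=True instead would flip both.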
def __init__(self):
self.servers = {} 舖頭名冊:邊間舖喺條街邊度
self.requests = {} 客人住邊間:客人名 → 舖頭名
self.request_sizes = {} 客人行李:客人名 → 幾多 MB
self.request_access = {} 客人最後出現:客人名 → timestamp(踢人用)
self.memory_limits = {} 舖頭行李上限:舖頭名 → 最多幾多 MB(L4 加)
self.request_locks = defaultdict(asyncio.Lock) L5 用
self.server_locks = defaultdict(asyncio.Lock) L6 用
self.servers = { 舖頭名冊
"srv_a": {
"replicas": 1,
"positions": [42],
},
"srv_b": {
"replicas": 1,
"positions": [78],
},
}
self.requests = { 客人住邊間(L2 開始 assign)
"r1": "srv_a",
"r2": "srv_a",
"r3": "srv_b",
}
self.request_sizes = { 客人行李(L2 新加,每個 request 帶幾多 MB)
"r1": 30,
"r2": 50,
"r3": 20,
}
self.request_access = { 客人最後出現(assign 時記 timestamp)
"r1": 1,
"r2": 2,
"r3": 3,
}
self.memory_limits = {} 舖頭行李上限(L4 先加)
self.request_locks = defaultdict(asyncio.Lock) L5 先用
self.server_locks = defaultdict(asyncio.Lock) L6 先用
_route(request_id)
assign_request 開頭 call 一次搵舖
def _total_size(self, server_name):
total = 0 開個 counter
for req_id, srv in self.requests.items(): 行晒所有 request
if srv == server_name: 呢個 request 係住喺目標 server
total = total + self.request_sizes[req_id] 加埋佢嘅 MB
return total 答總 MB
Used by: get_server_load / get_memory_used / the L4 eviction check (top_servers sums the sizes inline instead)
def _evict_lru_from(self, server_name):
candidates = [] 開個 list 裝呢間 server 嘅 request
for req_id, srv in self.requests.items(): 行晒所有 request
if srv == server_name: 住喺呢間 server
candidates.append(req_id) 入 list
if not candidates: 間 server 一個 request 都冇
return None 冇得踢
lru = None 等陣記住最舊嗰個
lru_time = None
for req_id in candidates: 逐個 candidate 比較 access time
access_time = self.request_access[req_id]
if lru is None or access_time < lru_time: 搵到更舊嘅
lru = req_id 記住佢
lru_time = access_time
del self.requests[lru] 由 routing dict 移走
if lru in self.request_sizes: 由 size dict 移走
del self.request_sizes[lru]
if lru in self.request_access: 由 access dict 移走
del self.request_access[lru]
return lru 答返被踢嗰個 request_id
Key point: evicting one request means clearing 3 dicts (requests + sizes + access)
The L4 while loop calls this repeatedly until there is enough room
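| Real-world metaphor | ChatRoute | Hashring |
|---|---|---|
| the street | circular track | hash ring |
| shop | server | node |
| customer | request | key |
| the customer's luggage | size_mb | (none; Hashring only counts heads) |
| shop address | position | position |
| most luggage a shop can hold | memory_limit | capacity |
| kick out the customer unseen the longest | evict LRU | evict LRU |
| branch | replica | replica |
| admission permit | semaphore | semaphore |
| lock | lock | lock |
In one sentence: the street has a few shops, and each customer who walks in goes clockwise to the nearest shop. Every customer carries luggage (MB). Each shop has a luggage limit, and once it is full it kicks out the customer who has not shown up for the longest time.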
To improve load distribution, each server can register multiple positions on the track. // one server can hold several slots
add_server_with_replicas(timestamp, server_name, num_replicas) registers a server at num_replicas positions. Position i is computed as compute_position(f"{server_name}_{i}"). Returns False if server already exists. // = add_node_with_replicas
After adding or removing replicas, active requests are automatically rebalanced. // automatic reassign
def add_server_with_replicas(self, timestamp, server_name, num_replicas):  # open a shop with N branches
    if server_name in self.servers:  # a shop with this name already exists
        return False
    positions = []  # will hold the N positions
    for i in range(num_replicas):  # 0, 1, 2, ... num_replicas-1
        positions.append(compute_position(f"{server_name}_{i}"))  # hash "srv_c_0", "srv_c_1", ...
    self.servers[server_name] = {  # register the shop
        "replicas": num_replicas,  # remember the replica count (get_replica_count needs it)
        "positions": positions  # all N positions go into the list
    }
    self._reassign()  # the street gained N markers, so re-route every customer
    return True

# 🟰 Same as Hashring get_replica_count
def get_replica_count(self, timestamp, server_name):  # how many branches this shop has
    if server_name not in self.servers:  # the shop does not exist
        return 0  # spec: return 0
    return self.servers[server_name]["replicas"]  # plain dict lookup
def __init__(self):
self.servers = {}
self.requests = {}
self.request_access = {}
self.request_sizes = {}
self.memory_limits = {}
self.request_locks = defaultdict(asyncio.Lock)
self.server_locks = defaultdict(asyncio.Lock)
self.servers = { 舖頭名冊(positions 可以有多個)
"srv_a": {
"replicas": 1,
"positions": [42],
},
"srv_c": {
"replicas": 3, L3 新嘢:一間舖可以佔多個位
"positions": [15, 42, 78],
},
}
self.requests = { 客人住邊間
"r1": "srv_a",
"r2": "srv_c",
"r3": "srv_c",
}
self.request_sizes = { 客人行李
"r1": 30,
"r2": 50,
"r3": 20,
}
self.request_access = { 客人最後出現
"r1": 1,
"r2": 2,
"r3": 3,
}
self.memory_limits = {} 舖頭行李上限(L4 先加)
self.request_locks = defaultdict(asyncio.Lock) L5 先用
self.server_locks = defaultdict(asyncio.Lock) L6 先用
Nothing old changes; _track / _route / _reassign already handle N positions
Only 2 new methods are added:
add_server_with_replicas(timestamp, server_name, num_replicas)
get_replica_count(timestamp, server_name)
compute_position(f"{server_name}_{i}")
each replica's position is the hash of server_name + index
_reassign()
called right after the add, so every existing request is re-routed
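To see how virtual nodes end up on the track, the sketch below computes replica positions with the same compute_position the notes copy from the spec and flattens them into a sorted track. replica_positions and build_track are hypothetical helper names used only for illustration; the actual position values depend on md5, so none are asserted here.

```python
import hashlib

def compute_position(s):
    # Same hash as in the notes: any string -> 0..99
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % 100

def replica_positions(server_name, num_replicas):
    # One position per replica, hashed from "name_0", "name_1", ...
    return [compute_position(f"{server_name}_{i}") for i in range(num_replicas)]

def build_track(servers):
    # Flatten {name: [positions]} into a sorted [(pos, name), ...] list
    return sorted((pos, name) for name, positions in servers.items() for pos in positions)

servers = {
    "srv_a": [compute_position("srv_a")],    # L1 style: a single position
    "srv_c": replica_positions("srv_c", 3),  # L3 style: three virtual nodes
}
track = build_track(servers)
```

Because the hash is deterministic, re-adding the same server with the same replica count always lands on the same slots, which is why _reassign() gives stable results.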
Each server has a RAM limit in megabytes. // MB, not a count!
When assigning would exceed the limit, keep evicting LRU until there's room. // may have to evict several!
set_memory_limit(timestamp, server_name, max_mb) sets RAM limit in MB. Default unlimited. // the MB limit
get_memory_limit(timestamp, server_name) returns limit. -1 = unlimited. 0 if not exist.
get_memory_used(timestamp, server_name) returns total MB used. Returns 0 if not exist. // sum of sizes, not a count
evict_oldest_session(timestamp, server_name) manually evicts LRU. Returns evicted request_id or None.
assign_request behavior: if (memory_used + new_size > limit) and request not already there, keep evicting LRU until enough room. A single 100 MB request may have to evict five 20 MB ones. Re-assigning refreshes access time only.
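How to approach L4:
1. Add the memory_limits ledger, because the whole new rule in L4 is "every shop has an MB limit".
2. set_memory_limit, get_memory_limit, get_memory_used.
3. _evict_lru_from.
4. evict_oldest_session, so callers can evict one manually.
5. assign_request, because the while eviction is easier to write once the helpers above exist.
6. remove_server, clearing memory_limits too when a shop closes.
Step 1: add the new L4 state first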
def __init__(self):
    ...
    self.memory_limits = {}  # server_name -> max MB (not set = unlimited)
Remember: Hashring L4 has capacities; ChatRoute L4 only changes "at most how many customers" into "at most how many MB of luggage".
Step 2: add the setter / getter / used API
def set_memory_limit(self, timestamp, server_name, max_mb):
    if server_name not in self.servers:
        return False
    self.memory_limits[server_name] = max_mb
    return True

def get_memory_limit(self, timestamp, server_name):
    if server_name not in self.servers:
        return 0  # server does not exist
    return self.memory_limits.get(server_name, -1)  # -1 = no limit set

def get_memory_used(self, timestamp, server_name):
    if server_name not in self.servers:
        return 0
    return self._total_size(server_name)
Why write these first? They are the most direct, and they pin down the L4 rules along the way: 0 = the server does not exist, -1 = no limit set, and used reuses the _total_size helper from L2.
Step 3: write the helper that actually evicts
def _evict_lru_from(self, server_name):
    candidates = []  # requests that live on this server
    for req_id, srv in self.requests.items():
        if srv == server_name:
            candidates.append(req_id)
    if not candidates:
        return None  # nothing to evict
    lru = None
    lru_time = None
    for req_id in candidates:  # find the smallest access time
        access_time = self.request_access[req_id]
        if lru is None or access_time < lru_time:
            lru = req_id
            lru_time = access_time
    del self.requests[lru]  # clear all three dicts
    if lru in self.request_sizes:
        del self.request_sizes[lru]
    if lru in self.request_access:
        del self.request_access[lru]
    return lru
This step is the core L4 increment. Hashring eviction clears 2 dicts; ChatRoute has the extra request_sizes, so it clears 3.
Step 4: wrap it in a manual eviction entry point
def evict_oldest_session(self, timestamp, server_name):
    if server_name not in self.servers:
        return None
    return self._evict_lru_from(server_name)
This method has no new logic of its own. It only exposes the helper as a public API so a caller can tell the system to evict the oldest request right now.
Step 5: only now change assign_request
def assign_request(self, timestamp, request_id, size_mb):
    server = self._route(request_id)
    if server is None:
        return None
    limit = self.memory_limits.get(server, -1)
    already_here = (self.requests.get(request_id) == server)
    if (limit != -1) and (not already_here):
        current_used = self._total_size(server)
        while current_used + size_mb > limit:  # while, not if: may evict several
            evicted = self._evict_lru_from(server)
            if evicted is None:  # nothing left to evict
                break
            current_used = self._total_size(server)
    if request_id not in self.request_sizes:
        self.request_sizes[request_id] = size_mb
    self.requests[request_id] = server
    self.request_access[request_id] = timestamp
    return server
Three things to remember here: first, this is a while, not an if; second, one big arrival may evict several requests in a row; third, when the same request is re-assigned to the same shop, already_here must skip eviction.
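The "while, not if" point can be traced with numbers. The sketch below is a simplified stand-alone model, not the class method: evict_until_fits is a hypothetical function that takes the sizes and access times of the requests on one server and reports which ones the loop would evict, oldest first.

```python
def evict_until_fits(sizes, access, limit, new_size):
    """Return the request ids evicted (LRU first) so that new_size fits under limit."""
    sizes = dict(sizes)  # work on a copy; the caller's dict is left alone
    evicted = []
    used = sum(sizes.values())
    while used + new_size > limit and sizes:  # stop when it fits or nothing is left
        lru = min(sizes, key=lambda req_id: access[req_id])  # smallest timestamp = oldest
        used -= sizes.pop(lru)
        evicted.append(lru)
    return evicted

sizes = {"r1": 20, "r2": 20, "r3": 20, "r4": 20, "r5": 20}
access = {"r1": 1, "r2": 2, "r3": 3, "r4": 4, "r5": 5}
big = evict_until_fits(sizes, access, limit=100, new_size=100)    # needs the whole server
medium = evict_until_fits(sizes, access, limit=100, new_size=30)  # 100+30 → 80+30 → 60+30
none_needed = evict_until_fits(sizes, access, limit=200, new_size=30)
```

With an if instead of a while, the 30 MB case would evict only r1 and leave the server at 110 MB, over its 100 MB limit.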
Step 6: finish by updating remove_server
def remove_server(self, timestamp, server_name):
    if server_name not in self.servers:
        return False
    del self.servers[server_name]
    if server_name in self.memory_limits:  # drop the stale limit too
        del self.memory_limits[server_name]
    self._reassign()
    return True
This cleanup is added last. If memory_limits is not cleared, a shop reopened under the same name would inherit the old limit.
def __init__(self):
self.servers = {}
self.requests = {}
self.request_access = {}
self.request_sizes = {}
self.memory_limits = {}
self.request_locks = defaultdict(asyncio.Lock)
self.server_locks = defaultdict(asyncio.Lock)
self.servers = { 舖頭名冊
"srv_a": {
"replicas": 1,
"positions": [42],
},
}
self.requests = { 客人住邊間
"r1": "srv_a",
"r2": "srv_a",
}
self.request_sizes = { 客人行李
"r1": 30,
"r2": 50,
}
self.request_access = { 客人最後出現(eviction 用:最細 = 最舊 = 先踢)
"r1": 1,
"r2": 2,
}
self.memory_limits = { 舖頭行李上限(L4 新加)
"srv_a": 50, srv_a 最多放 50 MB
}
self.request_locks = defaultdict(asyncio.Lock) L5 先用
self.server_locks = defaultdict(asyncio.Lock) L6 先用
Changed:
remove_server ← add del self.memory_limits[server_name]
assign_request ← L4 version: add the limit check and the while eviction loop (the L2 version has neither)
Added:
set_memory_limit(timestamp, server_name, max_mb)
get_memory_limit(timestamp, server_name)
get_memory_used(timestamp, server_name)
evict_oldest_session(timestamp, server_name)
_total_size(server_name)
get_memory_used borrows it directly
_evict_lru_from(server_name)
used by both evict_oldest_session and the while loop in assign_request
batch_requests(timestamp, operations) processes multiple ops concurrently. Each op:
{"type": "assign", "request_id": "...", "size_mb": N} // assign carries an extra size_mb!
{"type": "route", "request_id": "..."}: same as Hashring
{"type": "remove", "request_id": "..."}: return True/False
Lock per request_id. asyncio.gather. Return list in input order.
# 🟰 Same pattern as Hashring L5 (lock per key + gather)
async def batch_requests(self, timestamp, operations):  # run a batch of ops concurrently, one lock per request
    async def execute(op):  # what a single op does
        request_id = op["request_id"]  # the request_id is the lock key
        async with self.request_locks[request_id]:  # ops on the same request queue up
            if op["type"] == "assign":  # type 1: route the request into a shop
                size_mb = op["size_mb"]  # ⚠️ Hashring has no such line (no size)
                return self.assign_request(timestamp, request_id, size_mb)  # reuse assign_request
            elif op["type"] == "route":  # type 2: pure query
                return self._route(request_id)  # reuse the helper
            elif op["type"] == "remove":  # type 3: delete the request
                if request_id in self.requests:  # delete only if it exists
                    del self.requests[request_id]  # drop it from the main dict
                    if request_id in self.request_sizes:  # ⚠️ Hashring has no such dict
                        del self.request_sizes[request_id]  # clear the size so a reused id starts clean
                    if request_id in self.request_access:  # the access timestamp goes too
                        del self.request_access[request_id]
                    return True  # spec: deleted → True
                return False  # did not exist → False
            return None  # unknown type

    tasks = []  # collect every coroutine
    for op in operations:
        tasks.append(execute(op))
    results = await asyncio.gather(*tasks)  # run all ops concurrently
    return list(results)  # same order as the input (gather guarantees it)
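The lock-per-key plus gather pattern can be tried outside the class. This is a sketch with a plain dict as the store and hypothetical "set" / "get" op types; the sleep(0) inside the lock is added only to force a task switch so that the lock actually has something to protect.

```python
import asyncio
from collections import defaultdict

async def run_batch(ops):
    store = {}
    locks = defaultdict(asyncio.Lock)  # one lock per key, created on first use

    async def execute(op):
        key = op["key"]
        async with locks[key]:  # ops on the same key run one at a time
            await asyncio.sleep(0)  # yield to other tasks while holding the lock
            if op["type"] == "set":
                store[key] = op["value"]
                return True
            if op["type"] == "get":
                return store.get(key)
            return None  # unknown op type

    # gather returns results in the order the coroutines were passed in
    return list(await asyncio.gather(*(execute(op) for op in ops)))

ops = [
    {"type": "set", "key": "a", "value": 1},
    {"type": "get", "key": "a"},
    {"type": "get", "key": "b"},
    {"type": "set", "key": "a", "value": 2},
    {"type": "get", "key": "a"},
]
batch_results = asyncio.run(run_batch(ops))
```

asyncio.Lock hands the lock to waiters in the order they started waiting, so the four ops on key "a" see each other's effects in submission order even though all five tasks are in flight together.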
def __init__(self):
self.servers = {}
self.requests = {}
self.request_access = {}
self.request_sizes = {}
self.memory_limits = {}
self.request_locks = defaultdict(asyncio.Lock)
self.server_locks = defaultdict(asyncio.Lock)
self.servers = { 舖頭名冊
"srv_a": {
"replicas": 1,
"positions": [42],
},
}
self.requests = { 客人住邊間
"r1": "srv_a",
"r2": "srv_a",
}
self.request_sizes = { 客人行李
"r1": 50,
"r2": 30,
}
self.request_access = { 客人最後出現
"r1": 1,
"r2": 2,
}
self.memory_limits = { 舖頭行李上限
"srv_a": 100,
}
self.request_locks = defaultdict(asyncio.Lock) L5 新用:同一 request 排隊
self.server_locks = defaultdict(asyncio.Lock) L6 先用
operations = [ batch 入面嘅 op 格式
{"type": "assign", "request_id": "r1", "size_mb": 50},
{"type": "assign", "request_id": "r2", "size_mb": 30},
{"type": "route", "request_id": "r3"},
{"type": "remove", "request_id": "r1"},
]
Nothing old changes; 1 new method is added:
async batch_requests(timestamp, operations)
Internally it calls assign_request (L2) and _route (L1)
⚠️ the assign op must also read size_mb and pass it as the third param
⚠️ the remove op must clear three dicts (Hashring clears only two)
self.request_locks[request_id]
defaultdict(asyncio.Lock); ops on the same request queue up
assign_request / _route
the inner function calls the existing methods
asyncio.gather(*coros)
runs everything concurrently; result order = input order
replicate_sessions(timestamp, transfers, max_concurrent) syncs data across regions.
Each transfer: {"source": "...", "destination": "...", "bandwidth_mb": N} // extra bandwidth_mb field
Use Semaphore(max_concurrent). Sleep 0.01 to simulate.
Before acquiring semaphore, check TWO conditions: // the fail-fast test has one more condition
1. Both source and destination servers exist
2. get_memory_used(source) >= bandwidth_mb // the source must hold enough data to sync
If either check fails → return False immediately (NO sem, NO sleep; fail-fast).
If both pass → async with sem → sleep → return True.
# Fail-fast pattern + bandwidth check (enter sem + sleep only after all 3 checks pass)
# ⚠️ One extra check: the source's _total_size >= bandwidth_mb (Hashring has none)
async def replicate_sessions(self, timestamp, transfers, max_concurrent):  # cross-region sync
    sem = asyncio.Semaphore(max_concurrent)  # caps how many transfers really sleep at once

    async def do_transfer(transfer):  # how a single transfer is handled
        source = transfer["source"]  # shop it leaves from
        destination = transfer["destination"]  # shop it goes to
        bandwidth_mb = transfer["bandwidth_mb"]  # ⚠️ Hashring has no such field
        async with self.server_locks[source]:  # lock the source during the checks (avoid a race with eviction)
            if source not in self.servers:  # check 1: the source shop exists
                return False  # fail-fast (no sem, no sleep)
            if destination not in self.servers:  # check 2: the dest shop exists
                return False  # fail-fast
            if self._total_size(source) < bandwidth_mb:  # ⚠️ check 3: the source has enough data
                return False  # fail-fast (Hashring has no such check)
        async with sem:  # all three checks passed → enter the semaphore
            await asyncio.sleep(0.01)  # simulated sync I/O time
        return True

    tasks = []
    for t in transfers:
        tasks.append(do_transfer(t))
    results = await asyncio.gather(*tasks)
    return list(results)  # same order as the input
def __init__(self):
self.servers = {}
self.requests = {}
self.request_access = {}
self.request_sizes = {}
self.memory_limits = {}
self.request_locks = defaultdict(asyncio.Lock)
self.server_locks = defaultdict(asyncio.Lock)
self.servers = { 舖頭名冊
"srv_a": {
"replicas": 1,
"positions": [42],
},
"srv_b": {
"replicas": 1,
"positions": [78],
},
}
self.requests = { 客人住邊間
"r1": "srv_a",
"r2": "srv_a",
"r3": "srv_b",
}
self.request_sizes = { 客人行李
"r1": 50,
"r2": 30,
"r3": 20,
}
self.request_access = { 客人最後出現
"r1": 1,
"r2": 2,
"r3": 3,
}
self.memory_limits = { 舖頭行李上限
"srv_a": 100,
"srv_b": 200,
}
self.request_locks = defaultdict(asyncio.Lock) L5 用:同一 request 排隊
self.server_locks = defaultdict(asyncio.Lock) L6 新用:鎖 source check
transfers = [ replicate 入面嘅 transfer 格式
{"source": "srv_a", "destination": "srv_b", "bandwidth_mb": 50},
{"source": "srv_a", "destination": "srv_b", "bandwidth_mb": 100},
{"source": "srv_c", "destination": "srv_b", "bandwidth_mb": 1},
{"source": "srv_x", "destination": "srv_b", "bandwidth_mb": 10},
]
Nothing old changes; 1 new async method is added:
async replicate_sessions(timestamp, transfers, max_concurrent)
⚠️ Hashring L6 has only 2 fail-fast checks
ChatRoute adds a 3rd: _total_size(source) >= bandwidth_mb
asyncio.Semaphore(max_concurrent)
caps the number of transfers running at once
self.server_locks[source]
locks the source during the checks so another coroutine cannot evict at the same time
_total_size(source)
checks whether the source holds at least bandwidth_mb of data
asyncio.gather(*coros)
everything concurrent, order preserved
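To confirm that the semaphore really caps concurrency while the three checks stay outside it, the sketch below runs the pattern on plain data. It is an illustration under assumptions: servers_used is a hypothetical dict of server_name → MB held (standing in for _total_size), and the running / peak counters are probes added only for the demonstration.

```python
import asyncio

async def replicate(servers_used, transfers, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)
    running = 0  # transfers currently inside the semaphore
    peak = 0     # highest value running ever reached

    async def do_transfer(t):
        nonlocal running, peak
        src, dst, need = t["source"], t["destination"], t["bandwidth_mb"]
        if src not in servers_used or dst not in servers_used:
            return False  # checks 1 and 2: both servers must exist
        if servers_used[src] < need:
            return False  # check 3: source holds less than bandwidth_mb
        async with sem:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # simulated transfer
            running -= 1
        return True

    results = await asyncio.gather(*(do_transfer(t) for t in transfers))
    return list(results), peak

servers_used = {"srv_a": 80, "srv_b": 20}
transfers = [
    {"source": "srv_a", "destination": "srv_b", "bandwidth_mb": 50},
    {"source": "srv_a", "destination": "srv_b", "bandwidth_mb": 100},  # 80 < 100
    {"source": "srv_x", "destination": "srv_b", "bandwidth_mb": 10},   # no srv_x
    {"source": "srv_a", "destination": "srv_b", "bandwidth_mb": 10},
    {"source": "srv_b", "destination": "srv_a", "bandwidth_mb": 20},
    {"source": "srv_a", "destination": "srv_b", "bandwidth_mb": 80},
]
transfer_results, peak = asyncio.run(replicate(servers_used, transfers, max_concurrent=2))
```

Four transfers pass the checks, yet never more than two are inside the semaphore at once; the two failing ones return without ever queueing.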
── Helper ──
🟰 _purge_expired: same lazy pattern as Bank _process_cashbacks
── L1 CRUD ──
🟰 add_file: same as Bank create_account (check exist → True/False)
🟰 delete_file: del + True/False
🟰 get_file_size: same as Bank get_balance (return int or -1)
── L2 Sort ──
🟰 list_files: same as Bank top_spenders (sort + format string)
🟰 total_size: a simple for loop that adds everything up
── L3 TTL ──
🟰 add_file_with_ttl: TTL pattern similar to Bank pay
── L4 Copy ──
⚠️ copy_file: no Bank counterpart! Unique: overwrite dest + remaining TTL
── L5 Batch ──
🟰 batch_operations: same as Bank L5 (but copy needs sorted locks on both paths)
── L6 Sync ──
🟰 sync_files: fail-fast + sleep (one extra size check)
Imagine writing a simplified file system mock. Every file has a path ("/foo.txt") and a size (an int, in KB). You write a class that simulates create/read/delete, sorting, expiry, copying, and async batches.
Picture a directory:
┌──────────────────────────────────────┐
│ /foo.txt    size=100 kb              │
│ /bar.log    size=50 kb               │
│ /tmp/a.tmp  size=20 kb   TTL=5000ms  │
│ /tmp/b.tmp  size=30 kb   TTL=2000ms  │
└──────────────────────────────────────┘
Every file has:
path = the file's key ("/foo.txt")
size_kb = its size (int, KB)
expires_at = when it expires (None = never expires)
Rules:
1. Paths must be unique (check before add)
2. An expired TTL file counts as not existing (lazy purge)
3. copy_file onto an existing dest → overwrite the dest's size
# Example: a few queries against the directory above
get_file_size(t, "/foo.txt") → 100
get_file_size(t, "/zz.txt") → -1 (does not exist)
list_files(t, "size") → "/foo.txt(100), /bar.log(50), ..." (desc by size)
list_files(t, "path") → asc by path
total_size(t) → 100 + 50 + ...
# Later levels add more:
# L2 adds sort/filter (list_files, total_size)
# L3 adds TTL (add_file_with_ttl, lazy _purge_expired)
# L4 adds copy_file + backup/rollback (overwrite dest, carry the remaining TTL)
# L5 adds async batch_operations (per-path lock)
# L6 adds sync_files (rate-limited, semaphore)
import asyncio
import copy  # for deepcopy in the L4 backup/rollback code
from collections import defaultdict

class FileSystem:
    def __init__(self):
        self.files = {}  # L1: every file (path → info dict)
        self.backups = []  # added in L4: [(timestamp, snapshot)]
        self.locks = defaultdict(asyncio.Lock)  # added in L5: per-path async lock
self.files = {
    "/foo.txt": {"size_kb": 100, "expires_at": None},
    "/tmp/a.tmp": {"size_kb": 20, "expires_at": 5000},
}
# first-level key = the path ("/foo.txt")
# the second level is a dict holding that file's info
L1: size_kb # the basics
L2: (no new field; only reads size_kb)
L3: expires_at # None = never expires; int = when it expires
L4: self.backups # list added in init; backup/rollback + copy_file
L5: self.locks # one more defaultdict(asyncio.Lock) added in init
L6: (no new field; the semaphore is created inside the method)
# Helper: _purge_expired clears expired files lazily (every public method calls it first)
def _purge_expired(self, timestamp):  # not a timed task; this is the lazy model
    expired = []  # paths to delete (a dict cannot be deleted from while iterating)
    for path, info in self.files.items():  # look at each file
        exp = info["expires_at"]  # may be None
        if exp is None:  # None = never expires
            continue
        if timestamp >= exp:  # now >= expires_at → expired
            expired.append(path)
    for path in expired:  # delete them one by one
        del self.files[path]
_purge_expired(timestamp)
walks self.files once
any file whose expires_at is not None and timestamp >= expires_at
is deleted from self.files
every public method calls it on its first line (lazy model)
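The boundary rule (alive only while timestamp is strictly less than expires_at) is easy to get off by one, so here is a stand-alone check. purge_expired is a hypothetical free-function version of the helper that also returns what it removed, which the class method does not do.

```python
def purge_expired(files, timestamp):
    """Delete expired entries in place and return the paths that were removed."""
    expired = [
        path
        for path, info in files.items()
        if info["expires_at"] is not None and timestamp >= info["expires_at"]
    ]
    for path in expired:  # delete after the scan, never while iterating
        del files[path]
    return expired

files = {
    "/foo.txt": {"size_kb": 100, "expires_at": None},
    "/tmp/a.tmp": {"size_kb": 20, "expires_at": 5000},
}
removed_early = purge_expired(files, 4999)  # one tick before expiry
still_there = "/tmp/a.tmp" in files
removed_on_time = purge_expired(files, 5000)  # the expiry tick itself counts as dead
```

At timestamp 4999 the file survives; at exactly 5000 it is gone, while the file with expires_at=None is never touched.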
add = add  delete = delete  get_file_size = look up the size  -1 = the convention for "does not exist"
def add_file(self, timestamp, path, size_kb):  # add a new file
    self._purge_expired(timestamp)  # purge first (standard opening)
    if path in self.files:  # duplicate path → reject
        return False
    self.files[path] = {  # create the entry
        "size_kb": size_kb,
        "expires_at": None,  # no TTL = None (only the L3 method sets a number)
    }
    return True

# 🟰 No direct Bank counterpart, but the pattern is the same (check → del → True/False)
def delete_file(self, timestamp, path):  # delete a file
    self._purge_expired(timestamp)  # purge first
    if path not in self.files:  # missing (or already purged)
        return False
    del self.files[path]
    return True

def get_file_size(self, timestamp, path):  # look up the size
    self._purge_expired(timestamp)  # purge first (an expired file counts as missing)
    if path not in self.files:
        return -1  # convention: -1, not None, because the spec wants an int
    return self.files[path]["size_kb"]
def __init__(self):
self.files = {}
self.files = { 所有檔案(path → info dict)
"/foo.txt": {
"size_kb": 100, 檔案大細
"expires_at": None, 永遠唔過期(L3 先會 set 數字)
},
}
self.locks = defaultdict(asyncio.Lock) per-path 鎖(L5 先加)
_purge_expired(timestamp)
every L1 method calls it on its first line
L1 itself never produces an expired file (add_file always sets expires_at=None)
but build the habit now, so TTL works the moment L3 adds it
list_files = list every file  sort_by = "path" or "size"  total_size = sum of all sizes
def list_files(self, timestamp, sort_by):  # list every file (sorted by path or size)
    self._purge_expired(timestamp)  # purge first
    items = []  # (path, size_kb) tuples
    for path, info in self.files.items():
        items.append((path, info["size_kb"]))
    if sort_by == "size":  # size mode
        items.sort(key=lambda x: (-x[1], x[0]))  # size desc, ties by path asc
    else:  # default: path mode
        items.sort(key=lambda x: x[0])  # plain path asc
    parts = []  # pieces of the output string
    for path, size in items:  # turn each into "path(size)"
        parts.append(path + "(" + str(size) + ")")
    return ", ".join(parts)  # join with ", "

# 🟰 No Bank counterpart, but the pattern is simple (a for loop that adds up)
def total_size(self, timestamp):  # total size of all files
    self._purge_expired(timestamp)  # purge first (expired files do not count)
    total = 0
    for path, info in self.files.items():
        total += info["size_kb"]
    return total
self.files = { 所有檔案(path → info dict)
"/foo.txt": {"size_kb": 100, "expires_at": None},
"/bar.log": {"size_kb": 50, "expires_at": None},
"/abc.txt": {"size_kb": 100, "expires_at": None},
}
self.locks = defaultdict(asyncio.Lock) per-path 鎖(L5 先加)
What each x from self.files.items() looks like:
x = ("/foo.txt", {"size_kb": 100, "expires_at": None})
x = ("/bar.log", {"size_kb": 50, "expires_at": None})
x = ("/abc.txt", {"size_kb": 100, "expires_at": None})
x[0] = "/foo.txt" # path
x[1] = {"size_kb": 100, "expires_at": None} # the whole info dict
x[1]["size_kb"] = 100 # the size
items.append((path, info["size_kb"]))
→ items = [
("/foo.txt", 100),
("/bar.log", 50),
("/abc.txt", 100)
]
items.sort(key=lambda x: (-x[1], x[0]))
# -x[1] → size desc
# x[0] → same size: path asc
After sorting:
("/abc.txt", 100)
("/foo.txt", 100)
("/bar.log", 50)
parts = ["/abc.txt(100)", "/foo.txt(100)", "/bar.log(50)"]
", ".join(parts)
→ "/abc.txt(100), /foo.txt(100), /bar.log(50)"
If sort_by="path":
items.sort(key=lambda x: x[0])
→ plain alphabetical order by path, ascending
_purge_expired(timestamp)
both list_files and total_size call it first
expired files must not show up in the list or in the total
TTL = time to live  ttl_ms = how long until expiry (milliseconds)  expires_at = the absolute expiry timestamp  lazy = check only when used
def add_file_with_ttl(self, timestamp, path, size_kb, ttl_ms):  # add a file with a lifespan
    self._purge_expired(timestamp)  # purge first (the path may have just expired and be free again)
    if path in self.files:  # duplicate path → reject
        return False  # never overwrite silently
    self.files[path] = {
        "size_kb": size_kb,
        "expires_at": timestamp + ttl_ms,  # absolute expiry = now + lifespan
    }
    return True
def __init__(self):
self.files = {}
# 仲係冇加 instance var,TTL 資訊放入 file dict 入面
self.files = { 所有檔案(path → info dict)
"/foo.txt": {
"size_kb": 100, 檔案大細
"expires_at": None, 永遠唔過期(add_file 加嘅)
},
"/tmp/a.tmp": {
"size_kb": 20, 檔案大細
"expires_at": 5000, 5000 ms 過期(add_file_with_ttl 加嘅)
},
}
self.locks = defaultdict(asyncio.Lock) per-path 鎖(L5 先加)
_purge_expired(timestamp)
L3 真正用得着佢,凡 expires_at 不為 None 且 timestamp 到位就刪
令所有 query method 自然唔見過期 file
copy_file = copy source's size and TTL to dest
overwrite = if dest already exists, replace it
backup = snapshot every file's size and remaining TTL
rollback = restore from the most recent backup
remaining_ttl = how long until expiry
def copy_file(self, timestamp, source, dest): # 將 source 嘅 size + TTL 複製去 dest
self._purge_expired(timestamp) # 開頭先清過期
if source not in self.files: # source 唔存在(或者啱啱 purge 走咗)
return False # 冇得 copy
src_info = self.files[source] # 攞 source 嘅 info dict
src_size = src_info["size_kb"] # source 嘅大細(dest 跟住用同一個 size)
src_exp = src_info["expires_at"] # source 嘅 expires_at(可能 None)
if src_exp is None: # source 永遠唔過期
new_exp = None # dest 都永遠唔過期
else: # source 有 TTL
remaining = src_exp - timestamp # 計返 source 仲剩幾耐
new_exp = timestamp + remaining # dest 喺呢一刻起,再撐 remaining ms(即係同 src_exp 等值)
self.files[dest] = { # 直接覆蓋(或者新開)dest 嗰格
"size_kb": src_size, # 抄 source size
"expires_at": new_exp, # 抄 source 剩餘 TTL(或者 None)
}
return True # copy 成功
def backup(self, timestamp): # take a snapshot (size + remaining TTL per file, not a raw deepcopy)
self._purge_expired(timestamp) # 開頭先清過期
snapshot = {} # 張相(空 dict)
for path, info in self.files.items(): # 逐個 file 行
remaining = None # 默認冇 TTL
if info["expires_at"] is not None: # 有 TTL?
remaining = info["expires_at"] - timestamp # 計仲剩幾耐
snapshot[path] = { # 影低呢個 file
"size_kb": info["size_kb"], # 存 size
"remaining_ttl": remaining, # 存仲剩幾耐(唔係絕對 expiry!)
}
self.backups.append((timestamp, snapshot)) # 存張相入 backups list
def rollback(self, timestamp): # 還原去最近嗰張 backup(timestamp 或之前)
best = None # 記住最近嗰張相
for ts, snap in self.backups: # 逐張相睇
if ts <= timestamp: # 呢張喺目標時間或之前?
if best is None or ts > best[0]: # 係最近嗰張?
best = (ts, snap) # 記住佢
if best is None: # 搵唔到任何相
return False # 冇得還原
backup_ts, snapshot = best # 解構:backup 影相嗰刻 + 張相
self.files = {} # 清空成個 file storage
for path, fd in snapshot.items(): # 逐個 file 重建
exp = None # 默認冇 TTL
if fd["remaining_ttl"] is not None: # 有 TTL?
exp = timestamp + fd["remaining_ttl"] # 重算 expiry = 而家 + 仲剩幾耐
self.files[path] = { # 放返入 files
"size_kb": fd["size_kb"], # 還原 size
"expires_at": exp, # 新嘅 expiry(重算過)
}
return True # rollback 成功
⚠️ Bank backup/restore 同一個 pattern,加埋 remaining TTL recalculation
def __init__(self):
self.files = {}
self.backups = [] # L4 加
self.files = { 所有檔案(path → info dict)
"/a.txt": {
"size_kb": 100, 檔案大細
"expires_at": 5000, 過期時間
},
"/b.txt": {
"size_kb": 100, copy 完同 source 一樣
"expires_at": 5000, copy 完同 source 一樣
},
}
self.backups = [ 備份 list(L4 加)
(100, { timestamp=100 影嘅相
"/a.txt": {
"size_kb": 100, 存 size
"remaining_ttl": 4900, 存仲剩幾耐(唔係 expires_at!)
},
}),
]
self.locks = defaultdict(asyncio.Lock) per-path 鎖(L5 先加)
backup 影相:逐個 file 計 remaining_ttl = expires_at - timestamp。存嘅係「仲剩幾耐」,唔係絕對時間。
rollback 還原:搵最近嗰張相(ts <= 目標 timestamp),清空 files,逐個重建。新 expiry = 而家 timestamp + remaining_ttl。
None TTL 嘅 file:backup 存 remaining_ttl=None,rollback 重建時 expires_at=None(永遠唔過期)。
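The remaining-TTL round trip described above can be checked with two small functions. This is a sketch under assumptions: `take_backup` and `restore_backup` are hypothetical free functions that mirror the logic of `backup` / `rollback`, and the file names and timestamps are made up for illustration.

```python
def take_backup(files, now):
    """Store each file's size and how long it has left (not its absolute expiry)."""
    snapshot = {}
    for path, info in files.items():
        remaining = None
        if info["expires_at"] is not None:
            remaining = info["expires_at"] - now
        snapshot[path] = {"size_kb": info["size_kb"], "remaining_ttl": remaining}
    return snapshot


def restore_backup(snapshot, now):
    """Rebuild files; expiry is recomputed from the restore time."""
    files = {}
    for path, saved in snapshot.items():
        expires_at = None
        if saved["remaining_ttl"] is not None:
            expires_at = now + saved["remaining_ttl"]
        files[path] = {"size_kb": saved["size_kb"], "expires_at": expires_at}
    return files


files = {
    "/a.txt": {"size_kb": 100, "expires_at": 5000},
    "/keep.txt": {"size_kb": 7, "expires_at": None},
}
snapshot = take_backup(files, now=100)          # /a.txt has 4900 ms left
restored = restore_backup(snapshot, now=9000)   # new expiry = 9000 + 4900
print(restored["/a.txt"]["expires_at"])         # 13900
print(restored["/keep.txt"]["expires_at"])      # None
```

Storing the absolute expiry instead would make `/a.txt` already expired at restore time 9000, which is why the snapshot keeps the remaining lifespan.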
_purge_expired(timestamp)
copy_file / backup 開頭都 call,避免影到過期 file
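batch = run several operations in one call
lock = mutual-exclusion lock
per-path lock = one lock per path
sorted lock = when locking two paths, acquire them in alphabetical order (avoids deadlock)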
async def batch_operations(self, timestamp, ops):  # run a list of add/delete/copy ops
    results = []  # one True/False per op, in input order
    for op in ops:
        op_type = op["type"]
        if op_type == "add":
            path = op["path"]
            async with self.locks[path]:  # lock this path
                ok = self.add_file(timestamp, path, op["size_kb"])  # reuse L1 add_file
            results.append(ok)
        elif op_type == "delete":
            path = op["path"]
            async with self.locks[path]:  # lock this path
                ok = self.delete_file(timestamp, path)  # reuse L1 delete_file
            results.append(ok)
        elif op_type == "copy":  # touches two paths
            source = op["source"]
            dest = op["dest"]
            first, second = sorted((source, dest))  # always lock the smaller path first → no deadlock
            async with self.locks[first]:
                if first == second:  # same path: asyncio.Lock is not reentrant, so lock only once
                    ok = self.copy_file(timestamp, source, dest)
                else:
                    async with self.locks[second]:
                        ok = self.copy_file(timestamp, source, dest)
            results.append(ok)
        else:  # unsupported type
            results.append(False)
    return results  # same length as the input list
def __init__(self):
self.files = {}
self.locks = defaultdict(asyncio.Lock)
self.files = { 所有檔案(path → info dict)
"/foo.txt": {
"size_kb": 100, 檔案大細
"expires_at": None, 永遠唔過期
},
}
self.locks = { per-path 鎖(defaultdict 一 access 就自動造)
"/foo.txt": <asyncio.Lock>, 鎖同一個 path 嘅 op 排隊
"/bar.txt": <asyncio.Lock>,
}
_purge_expired(timestamp)
間接 call(add_file / delete_file / copy_file 第一行都 call)
無額外 helper(鎖順序邏輯寫死喺 batch_operations 入面)
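The lock-ordering rule can be exercised on its own. The sketch below assumes a hypothetical `copy_with_pair_lock` coroutine and a plain `log` list standing in for the real copy; it runs two copies in opposite directions plus a same-path copy and shows that all three finish.

```python
import asyncio
from collections import defaultdict


async def copy_with_pair_lock(locks, source, dest, log):
    """Hypothetical stand-in for a two-path op guarded by ordered locks."""
    first, second = sorted((source, dest))  # same acquisition order for every task
    async with locks[first]:
        if first == second:
            # asyncio.Lock is not reentrant: taking it twice would hang.
            log.append((source, dest))
            return True
        async with locks[second]:
            await asyncio.sleep(0)  # yield so the other tasks get a chance to run
            log.append((source, dest))
            return True


async def main():
    locks = defaultdict(asyncio.Lock)  # created inside the running loop
    log = []
    results = await asyncio.wait_for(
        asyncio.gather(
            copy_with_pair_lock(locks, "/a", "/b", log),
            copy_with_pair_lock(locks, "/b", "/a", log),  # opposite direction
            copy_with_pair_lock(locks, "/a", "/a", log),  # same path
        ),
        timeout=1.0,  # a deadlock would surface as a timeout
    )
    return results, log


results, log = asyncio.run(main())
print(results)  # [True, True, True]
```

Without `sorted`, the first two tasks could each hold one lock and wait for the other; with it, both ask for `/a` first, so one simply queues behind the other.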
sync = synchronised transfer
semaphore = limits how many tasks run at the same time
fail-fast = fail immediately when a precondition is not met, without waiting for the semaphore
async def sync_files(self, timestamp, transfers, max_concurrent): # 並行做一堆 transfer,限 N 個 concurrent
self._purge_expired(timestamp) # 開頭先清過期
sem = asyncio.Semaphore(max_concurrent) # 開一個 N 位嘅 semaphore(同時最多 N 個)
tasks = [] # 暫存所有 coroutine task
for transfer in transfers: # 逐個 transfer 包做一個 task
task = self._do_one_sync(timestamp, transfer, sem) # 起 coroutine(未 await)
tasks.append(task) # 入 list
results = await asyncio.gather(*tasks) # 並發跑,等全部完,保留順序
final = [] # 轉做正常 list
for r in results: # 逐個 copy 過
final.append(r) # 入 list
return final # 返一個同 transfers 一樣長嘅 list[bool]
async def _do_one_sync(self, timestamp, transfer, sem): # 做單一 transfer(async helper)
source = transfer["source"] # 攞 source path
size_kb = transfer["size_kb"] # 攞要 transfer 嘅 size
# fail-fast:未攞 semaphore 之前已經 check(唔阻住其他 task)
if source not in self.files: # source 唔存在
return False # 即刻 False,唔 acquire semaphore
actual_size = self.files[source]["size_kb"] # 攞 source 嘅實際大細
if actual_size < size_kb: # source 唔夠大畀你 transfer
return False # 即刻 False,唔 acquire semaphore
async with sem: # 過咗 fail-fast 先攞 semaphore(限速)
await asyncio.sleep(0.01) # 模擬 transfer 嘅延遲(10ms)
return True # transfer 成功
def __init__(self):
self.files = {}
self.locks = defaultdict(asyncio.Lock)
同 L5 一樣,semaphore 喺 method 入面開(per-call)
self.files = { 所有檔案(path → info dict)
"/big.bin": {
"size_kb": 5000, 檔案大細
"expires_at": None, 永遠唔過期
},
}
self.locks = defaultdict(asyncio.Lock) per-path 鎖(同 L5)
semaphore 唔放入 self,每次 sync 重新開
_purge_expired(timestamp)
sync_files 開頭 call 一次
_do_one_sync(timestamp, transfer, sem)
本 level 自家嘅 async helper,包住 fail-fast + semaphore + sleep
好似打機排名榜。加玩家、改分數、睇排名。Score decay = 定時全員扣分。
import asyncio
import copy
from collections import defaultdict
class Leaderboard:
def __init__(self):
self.players = {} # player_id → {"score": int}
self.decay_interval = None # L3 decay 間隔
self.decay_next_due = None # L3 下次 decay 時間
self.snapshots = [] # L4 [(ts, players_copy)]
self.player_locks = defaultdict(asyncio.Lock) # L5
self.players = {
"alice": {"score": 500},
"bob": {"score": 300},
}
self.decay_interval = 1000 # 每 1000ms 扣一次
self.decay_next_due = 2000 # ts >= 2000 就要補做 decay
self.snapshots = [
(1500, {"alice": {"score": 600}}),
]
self.player_locks = {
"alice": <asyncio.Lock>,
"bob": <asyncio.Lock>,
}
def _process_decay(self, timestamp):  # lazy decay: deduct 10% from everyone once per interval
    if self.decay_interval is None:  # decay not configured → nothing to do
        return
    while self.decay_next_due is not None and timestamp >= self.decay_next_due:  # catch up on every missed decay
        for pid in self.players:
            old = self.players[pid]["score"]  # score before this round
            self.players[pid]["score"] = old - old // 10  # integer division: deduct floor(10%)
        self.decay_next_due += self.decay_interval  # advance to the next due time
乘法式 decay: 100→90→81(唔係每次扣10)
while loop 處理多個到期 decay
score=0 → 0//10=0 → 仲係 0
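The three notes above (multiplicative decay, catch-up loop, zero stays zero) can be verified with a stand-alone version of the loop. `catch_up_decay` is a hypothetical free function, the player names and timestamps are illustrative, and `interval` is assumed to be positive (a zero interval would never terminate).

```python
def catch_up_decay(scores, next_due, interval, now):
    """Apply every 10% decay that is due at or before `now`; return the new next_due."""
    while now >= next_due:
        for pid in scores:
            scores[pid] -= scores[pid] // 10  # floor of 10% of the current score
        next_due += interval
    return next_due


scores = {"alice": 100, "bob": 5, "carol": 0}
# Decays were due at 2000 and 3000, so a call at 3500 applies two rounds.
next_due = catch_up_decay(scores, next_due=2000, interval=1000, now=3500)
print(scores)    # {'alice': 81, 'bob': 5, 'carol': 0}
print(next_due)  # 4000
```

One side effect of integer division worth knowing: any score below 10 loses `score // 10 == 0` points, so it never decays further.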
add_player(ts, pid) → score=0; returns False if the player already exists.
update_score(ts, pid, delta) → clamps the result at >= 0.
get_rank(ts, pid) → 1-based rank.
def add_player(self, timestamp, player_id): # add a new player
self._process_decay(timestamp) # 先處理到期嘅 decay
if player_id in self.players: return False # 已經存在 → 唔加
self.players[player_id] = {"score": 0} # 新玩家分數由 0 開始
return True # 成功加入
def update_score(self, timestamp, player_id, delta): # 改分數(clamp >= 0)
self._process_decay(timestamp) # 先處理到期嘅 decay
if player_id not in self.players: return None # 玩家唔存在 → None
new = self.players[player_id]["score"] + delta # 舊分數加 delta(可以負數)
if new < 0: new = 0 # clamp
self.players[player_id]["score"] = new # 寫返新分數入去
return new # 答返新分數畀 caller
def get_rank(self, timestamp, player_id): # 查排名(1-based)
self._process_decay(timestamp) # 先處理到期嘅 decay
if player_id not in self.players: return None # 玩家唔存在 → None
items = [] # 裝所有人嘅 (id, score) tuple
for pid, pd in self.players.items(): # 逐個玩家攞分數
items.append((pid, pd["score"])) # 塞入 list
items.sort(key=lambda x: (-x[1], x[0])) # 分數高排先,同分按名排
for i in range(len(items)): # 行一次搵佢喺第幾
if items[i][0] == player_id: return i + 1 # 搵到就 return 名次(1-based)
return None # 理論上到唔到呢度
self.players = {
"alice": {"score": 500},
"bob": {"score": 300},
}
add_player("cathy") 後:
self.players["cathy"] = {"score": 0}
Step 1:add_player 起外層 player_id
Step 2:update_score 改 players[player_id]["score"]
Step 3:如果新分數跌穿 0,要 clamp 返做 0
Step 4:get_rank 要 sort 晒全部 player 先知自己排第幾
top_players(ts, n) → list of "pid(score)" strings.
players_above(ts, min) → list of pids.
def top_players(self, timestamp, n): # top N players by score
self._process_decay(timestamp) # 先處理到期嘅 decay
items = [] # 裝所有人嘅 (id, score)
for pid, pd in self.players.items(): # 逐個玩家
items.append((pid, pd["score"])) # 塞入 list
items.sort(key=lambda x: (-x[1], x[0])) # 分數高排先
result = [] # 開空 list 裝結果
for pid, sc in items[:n]: # 只攞頭 n 個
result.append(f"{pid}({sc})") # 砌做 "alice(500)" 形式
return result # already sorted by (-score, player_id)
def players_above(self, timestamp, min_score): # 搵分數 >= 某值嘅玩家
self._process_decay(timestamp) # 先處理到期嘅 decay
items = [] # 裝符合條件嘅人
for pid, pd in self.players.items(): # 逐個玩家睇
if pd["score"] >= min_score: # 分數夠唔夠高?
items.append((pid, pd["score"])) # 夠就入 list
items.sort(key=lambda x: (-x[1], x[0])) # 分數高排先
result = [] # 開空 list 裝結果
for pid, _ in items: # 逐個攞 id 出嚟
result.append(pid) # 只返 id
return result # pids only, sorted by (-score, player_id)
self.players = {
"alice": {"score": 500},
"bob": {"score": 500},
"cathy": {"score": 200},
}
sort key = (-score, player_id)
所以同分會 alice 先過 bob
Step 1:掃晒 players 攞 (player_id, score)
Step 2:按 (-score, player_id) 排
Step 3:top_players 砌 "pid(score)"
Step 4:players_above 只拎 pid,唔使 format 分數
apply_decay(ts, interval) → every interval ms, all players lose 10%. Calling it again overwrites the previous setting.
def apply_decay(self, timestamp, interval): # configure the decay interval
self._process_decay(timestamp) # 先處理之前到期嘅 decay
self.decay_interval = interval # 記低間隔時間
self.decay_next_due = timestamp + interval # 設定下一次扣分嘅時間點
self.players = {
"alice": {"score": 500},
"bob": {"score": 300},
}
self.decay_interval = 1000
self.decay_next_due = 2000
ts=2500 時要補做一次:
alice 500 → 450
bob 300 → 270
Step 1:apply_decay 只係設定 interval 同 next_due
Step 2:真正扣分係每次 public method 入面 lazy call _process_decay
Step 3:while loop 補做所有過期 batch
Step 4:每次都係 score = score - score // 10
season_snapshot(ts) → deepcopy of players.
season_restore(ts, snap_ts) → restore players from a snapshot. Decay settings are not restored.
def season_snapshot(self, timestamp): # take a snapshot of the rankings
self._process_decay(timestamp) # 先處理到期嘅 decay
self.snapshots.append((timestamp, copy.deepcopy(self.players))) # deepcopy 而家嘅 players 存入 snapshots list
return len(self.players) # return 玩家人數
def season_restore(self, timestamp, snapshot_timestamp): # 還原到某個 snapshot
self._process_decay(timestamp) # 先處理到期嘅 decay
chosen = None # 用嚟記搵到嘅最佳 snapshot
for ts, state in self.snapshots: # 逐個 snapshot 睇
if ts <= snapshot_timestamp: # 呢張相嘅時間係咪 <= 你想回帶嗰刻?
if chosen is None or ts > chosen[0]: # 搵最近嗰個(ts 最大嘅)
chosen = (ts, state) # 記低呢個 candidate
if chosen is None: return False # 搵唔到符合嘅 → False
self.players = copy.deepcopy(chosen[1]) # deepcopy snapshot 蓋過而家嘅 players
return True # 還原成功
self.players = {
"alice": {"score": 700},
"bob": {"score": 50},
}
self.snapshots = [
(3000, {"alice": {"score": 300}, "bob": {"score": 200}}),
(5000, {"alice": {"score": 700}, "bob": {"score": 50}}),
(8000, {"alice": {"score": 100}, "bob": {"score": 900}}),
]
season_restore(..., 6000)
→ 揀 5000 嗰張相還原
Step 1:season_snapshot 直接 deepcopy(self.players)
Step 2:season_restore 掃 snapshots,搵 ts <= snapshot_timestamp 入面最近嗰張
Step 3:restore 時 deepcopy 嗰張相蓋返 self.players
Step 4:因為冇 TTL,所以 snapshot 入面唔使轉 remaining_ttl
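Steps 2 and 3 can be sketched as a small selection function plus a deepcopy. `pick_snapshot` is a hypothetical helper; the snapshot data repeats the example state above.

```python
import copy


def pick_snapshot(snapshots, target_ts):
    """Return the (ts, state) pair with the largest ts <= target_ts, or None."""
    chosen = None
    for ts, state in snapshots:
        if ts <= target_ts and (chosen is None or ts > chosen[0]):
            chosen = (ts, state)
    return chosen


snapshots = [
    (3000, {"alice": {"score": 300}, "bob": {"score": 200}}),
    (5000, {"alice": {"score": 700}, "bob": {"score": 50}}),
    (8000, {"alice": {"score": 100}, "bob": {"score": 900}}),
]

chosen = pick_snapshot(snapshots, 6000)    # 8000 is too late, 5000 is the closest
players = copy.deepcopy(chosen[1])         # restore a private copy
players["alice"]["score"] = 0              # later updates must not leak into the snapshot
too_early = pick_snapshot(snapshots, 2999) # nothing at or before 2999

print(chosen[0])                          # 5000
print(snapshots[1][1]["alice"]["score"])  # 700 (snapshot untouched)
print(too_early)                          # None
```

The deepcopy on restore matters because the same snapshot may be restored more than once; sharing the nested dicts would let later score updates corrupt it.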
async def batch_operations(self, timestamp, operations): # 批量操作(lock per key + gather)
self._process_decay(timestamp) # 先處理到期嘅 decay
async def execute_op(op): # 處理單一 op(async inner function)
player_id = op["player_id"] # 每張 leaderboard 單都係圍住同一個 player_id 轉
lock = self.player_locks[player_id] # Pattern A:先拎返呢個 player 對應嗰把 lock
async with lock: # 同一個 player 嘅 update/get_rank/add 要排隊
if op["type"] == "update_score": # 判斷 op 類型:改分數
return self.update_score(timestamp, player_id, op["delta"]) # 派去 L1 update_score
elif op["type"] == "get_rank": # 查排名
return self.get_rank(timestamp, player_id) # 派去 L1 get_rank
elif op["type"] == "add_player": # 加新玩家
return self.add_player(timestamp, player_id) # 派去 L1 add_player
return None # 唔識嘅 type 一律答 None,保持 output 對位
tasks = [] # 先開個空 list,等陣逐項放 coroutine 入去
for op in operations: # 保持 input 順序逐張單包做 task
tasks.append(execute_op(op)) # 逐條 append,最後一次過 gather
results = await asyncio.gather(*tasks) # 砌 coroutine list + gather 同時跑
return list(results) # return 結果 list,順序同 input 一樣
self.players = {
"alice": {"score": 700},
"bob": {"score": 50},
}
self.player_locks = {
"alice": <asyncio.Lock>,
"bob": <asyncio.Lock>,
}
Pattern A:
update_score / get_rank / add_player
全部都係鎖 player_id
Step 1:每張 op 先抽出 player_id
Step 2:lock = self.player_locks[player_id]
Step 3:async with lock 入面 call 舊 sync method
Step 4:最後 gather 保持 input output 對位
async def sync_leaderboard(self, timestamp, sync_requests, max_concurrent): # 並發 sync(fail-fast)
self._process_decay(timestamp) # 先處理到期嘅 decay
sem = asyncio.Semaphore(max_concurrent) # 限制同時 N 個
async def do_sync(req): # 做一次 sync
player_id = req["player_id"] # 每張 sync 單都指定一個 player
if player_id not in self.players: return False # fail-fast:player 唔存在即走,唔好霸 sem 位
async with sem: # 合格先入 sem
await asyncio.sleep(0.01) # 模擬外部 API call
return True # sync 成功
tasks = [] # 先開個空 list,等陣逐項放結果或工作入去
for r in sync_requests: # 每個 request 包一個 coroutine
tasks.append(do_sync(r)) # 入 list,等陣一齊跑
results = await asyncio.gather(*tasks) # 砌 coroutine list + gather 同時跑
return list(results) # return 結果 list,保持同 input 一樣長
持久 state:
self.players = {
"alice": {"score": 700},
"bob": {"score": 50},
}
self.player_locks = defaultdict(asyncio.Lock)
臨時 runtime:
sem = asyncio.Semaphore(max_concurrent)
L6 冇新增 self.xxx field
只係 sync_leaderboard() 每次 call 自己開一個 sem
Step 1:player 唔存在先 fail-fast,未過關前唔好入 sem
Step 2:過關先 async with sem + sleep
Step 3:gather 收返同 input 對位嘅結果
Step 4:呢題 sync 只模擬外部 call,本身唔改 leaderboard state
── L1 CRUD ──
🟰 create_alert: same pattern as Bank create_account
⚠️ acknowledge_alert: no Bank counterpart (boolean toggle; returns False if already acked)
🟰 get_alert: similar to Bank get_balance (returns dict or None)
── L2 Filter ──
⚠️ list_by_severity: filters by severity, format "id(message)"
⚠️ list_unacknowledged: unique to this domain (filters acknowledged=False, format "id:severity")
── L3 Escalation ──
⚠️ set_escalation_deadline: Bank cashback adds money; here severity is raised (different logic)
🟰 _process_escalations: same lazy pattern as Bank _process_cashbacks
── L4 Merge + History ──
⚠️ merge_alerts: Bank merge adds balances; here take max severity and concatenate messages
🟰 get_history: same as Workflow get_history (returns list[str])
── L5 Batch ──
🟰 batch_operations: same as Bank L5 (lock per alert_id)
── L6 Send ──
⚠️ send_notifications: ALL-SLEEP! Every entry goes through sem + sleep and is checked only after the sleep (not fail-fast)
好似 PagerDuty。建 alert(帶嚴重度 1-5)、ack 確認、太耐唔理就 auto-escalate severity。合併 alert、batch 操作、rate-limited 發送通知。
想像一個 alert dashboard:
┌─────────────────────────────────────────────────────────┐
│ alert_id message severity acked escalated │
│ "disk01" "disk full" 4 False False │
│ "cpu01" "high cpu" 2 True False │
│ "mem01" "low memory" 3 False True │
└─────────────────────────────────────────────────────────┘
每個 alert 有:
alert_id = 唯一 key("disk01")
message = 描述("disk full")
severity = 嚴重度 1-5(5 最嚴重)
acknowledged = 有冇人確認咗
escalated = 有冇自動升過級(每個 alert 最多升一次)
created_at = 建立時間
規則:
1. alert_id 唔可以重複(create 之前要 check)
2. acknowledge 已 ack 嘅 alert → return False
3. 超時未 ack → severity += 1(max 5),標 escalated=True
4. merge 將 source 併入 dest,刪 source
get_alert(t, "disk01") → {"message": "disk full", "severity": 4, ...}
get_alert(t, "zz") → None(唔存在)
list_by_severity(t, 3) → alerts with severity >= 3
→ "disk01(disk full), mem01(low memory)"
→ severity desc, tie-break id asc
list_unacknowledged(t) → alerts not yet acked
→ "disk01:4, mem01:3"
→ sorted by id asc
import asyncio
from collections import defaultdict
class NotificationSystem:
def __init__(self):
self.alerts = {} # L1 所有 alert(alert_id → info dict)
self.history = defaultdict(list) # appended to from L1, read by L4 get_history: per-alert_id event log (separate dict)
self.escalations = {} # L3 加:per-alert deadline(alert_id → deadline_ms)
self.locks = defaultdict(asyncio.Lock) # L5 加:per-alert_id 嘅 async lock
self.alerts = {
"disk01": {
"message": "disk full",
"severity": 4,
"acknowledged": False,
"escalated": False,
"created_at": 100,
},
}
第一層 key = alert_id("disk01")。第二層係個 dict,存呢個 alert 嘅所有 info。
self.history = defaultdict(list)
self.history = {
"disk01": ["CREATED at 100", "ESCALATED at 5100"],
"cpu01": ["CREATED at 200", "ACKNOWLEDGED at 300"],
}
history 係獨立 dict,唔係塞喺 alert dict 入面。
defaultdict(list) → access 唔存在嘅 key 自動開 []。
L1:message, severity, acknowledged, escalated, created_at
L2:(冇加新 field,只係讀 severity / acknowledged)
L3:self.escalations = {} per-alert deadline
L4:get_history reads self.history = defaultdict(list) (event strings, appended since L1)
L5:self.locks = defaultdict(asyncio.Lock)
L6:(冇加新 field,semaphore 喺 method 入面開)
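lazy = no background timer. State is refreshed only by methods that need to see the latest active set: usually at the start of every public method, or from a dedicated cleanup API if the spec defines one. The refresh checks each alert for a missed ack deadline: severity += 1, capped at 5, and each alert escalates at most once.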
# Helper: _process_escalations: lazy auto-escalation (called at the start of every public method)
def _process_escalations(self, timestamp):  # note the plural: escalations
    for aid, alert in self.alerts.items():
        if aid not in self.escalations:  # no deadline set for this alert → skip
            continue
        if alert["acknowledged"]:  # already acked → never escalates
            continue
        if alert["escalated"]:  # each alert escalates at most once
            continue
        deadline = self.escalations[aid]  # absolute deadline timestamp
        if timestamp >= deadline:  # deadline reached
            if alert["severity"] < 5:  # cap at 5
                alert["severity"] += 1
            alert["escalated"] = True  # marked even when severity was already 5
            self.history[aid].append("ESCALATED at " + str(timestamp))  # log the event
_process_escalations(timestamp)
行一次 self.alerts
凡係:
1. 有設 deadline(aid in self.escalations)
2. 未 ack
3. 未升過
4. timestamp >= deadline
就 severity += 1(max 5),標 escalated=True
每個 public method 第一行都 call 一次(lazy 模式)
FS:過期 → 刪走(del self.files[path])
NF:過期 → 升級(severity += 1),唔刪
FS:全部 file 都有 expires_at
NF:只有被 set_escalation_deadline 嘅 alert 先有 deadline
共通點:都係 lazy(唔係 background timer)
都係每個 public method 開頭 call
第一次 timestamp >= deadline:
severity 3 → 4,escalated = True
第二次 call _process_escalations:
alert["escalated"] == True → continue
唔會再升
即使 severity 仲未到 5,都唔會再升
create = create an alert
acknowledge = confirm an alert
get_alert = fetch alert info
severity = 1-5 (5 is most severe)
def create_alert(self, timestamp, alert_id, message, severity): # 建一個新 alert
self._process_escalations(timestamp) # 開頭先跑升級(公定模式)
if alert_id in self.alerts: # 重複 alert_id → 拒收
return False # 約定 return False
self.alerts[alert_id] = { # 開一格新 alert
"message": message, # 描述("disk full")
"severity": severity, # 嚴重度 1-5
"acknowledged": False, # 未確認
"escalated": False, # 未升過級
"created_at": timestamp, # 建立時間(L3 deadline 用)
}
self.history[alert_id].append("CREATED at " + str(timestamp)) # 記入獨立 history dict
return True # 建成功
# ⚠️ Bank 冇對應 — 獨有(boolean toggle + 已 ack 就 False)
def acknowledge_alert(self, timestamp, alert_id): # 確認收到(ack)
self._process_escalations(timestamp) # 開頭先跑升級
if alert_id not in self.alerts: # 唔存在
return False # 冇得 ack
if self.alerts[alert_id]["acknowledged"]: # 已經 ack 過
return False # 唔好重複 ack(呢個係 Bank 冇嘅 check)
self.alerts[alert_id]["acknowledged"] = True # 標記已確認
self.history[alert_id].append("ACKNOWLEDGED at " + str(timestamp)) # 記入 history
return True # ack 成功
# 🟰 同 Bank get_balance 類似(return dict or None)
def get_alert(self, timestamp, alert_id): # 攞 alert 資料
self._process_escalations(timestamp) # 開頭先跑升級(升完先攞,保證 severity 最新)
if alert_id not in self.alerts: # 唔存在
return None # 約定 return None(唔係 -1)
a = self.alerts[alert_id] # 攞個 alert dict
return { # 返一個 copy(唔畀外面直接改 internal state)
"message": a["message"], # 描述
"severity": a["severity"], # 嚴重度(可能已被 escalate)
"acknowledged": a["acknowledged"], # ack 狀態
"escalated": a["escalated"], # 升級狀態
"created_at": a["created_at"], # 建立時間
}
def __init__(self):
self.alerts = {}
self.history = defaultdict(list)
self.alerts = { 所有警報(alert_id → info dict)
"disk01": {
"message": "disk full", 描述
"severity": 4, 嚴重度 1-5
"acknowledged": False, 有冇人確認咗
"escalated": False, 有冇自動升過級
"created_at": 100, 建立時間
},
}
self.history = { 事件記錄(alert_id → list of strings)
"disk01": ["CREATED at 100"], 每次動作加一條
}
self.escalations = {} per-alert deadline(L3 先加)
self.locks = defaultdict(asyncio.Lock) per-alert_id 鎖(L5 先加)
alert dict 存狀態,history dict 存事件記錄。兩個 dict 用同一個 alert_id 做 key。
history 用 defaultdict(list) → 唔使 check key 存唔存在。
Bank 嘅 deposit/withdraw 冇「已做過」check。NF 嘅 acknowledge 有:已 ack → return False(唔好重複 ack)。
第一次 ack → True,改 acknowledged = True
第二次 ack → False,唔會改任何嘢
→ the return value is not idempotent (True the first time, False afterwards), although the stored state stays acknowledged
list_by_severity = list alerts with severity >= min
list_unacknowledged = list alerts not yet acked
The two use different output formats.
def list_by_severity(self, timestamp, min_severity): # 列出 severity >= 某值嘅 alert
self._process_escalations(timestamp) # 開頭先跑升級(升完先 filter,severity 最新)
items = [] # 暫存符合條件嘅 (id, severity, message) tuple
for aid, a in self.alerts.items(): # 逐個 alert 睇
if a["severity"] >= min_severity: # severity >= 最低要求(注意係 >=,唔係 ==)
items.append((aid, a["severity"], a["message"])) # 入 tuple
items.sort(key=lambda x: (-x[1], x[0])) # severity desc,tie-break id asc
parts = [] # 砌 output 字串
for aid, sev, msg in items: # 逐個轉做 "id(message)"
parts.append(aid + "(" + msg + ")") # 格式:"disk01(disk full)"
return ", ".join(parts) # 用 ", " 連埋一齊
# ⚠️ 獨有(filter acknowledged=False + format "id:severity")
def list_unacknowledged(self, timestamp): # 列出未 ack 嘅 alert
self._process_escalations(timestamp) # 開頭先跑升級
items = [] # 暫存 (id, severity) tuple
for aid, a in self.alerts.items(): # 逐個 alert 睇
if not a["acknowledged"]: # 未 ack 嘅先入
items.append((aid, a["severity"])) # 入 tuple
items.sort(key=lambda x: x[0]) # sorted by id asc
parts = [] # 砌 output 字串
for aid, sev in items: # 逐個轉做 "id:severity"
parts.append(aid + ":" + str(sev)) # 格式:"disk01:4"(注意用 : 唔係括號)
return ", ".join(parts) # 用 ", " 連埋一齊
items.sort(key=lambda x: (-x[1], x[0])) 做咩?x 係 tuple,例如 ("disk01", 4, "disk full")。-x[1] = -4(加負號 → 高 severity 排最前),x[0] = "disk01"(tie-break 用)。sort 默認細嘅排先 → -severity 最細 = severity 最大 → 嚴重嘅排先。
三個 alert sort 前後:
("cpu01", 3, "high cpu") → (-3, "cpu01")
("disk01", 4, "disk full") → (-4, "disk01")
("mem01", 3, "low mem") → (-3, "mem01")
sort 後:
("disk01", 4, "disk full") -4 最細,排最前
("cpu01", 3, "high cpu") -3, "cpu01" c 排先
("mem01", 3, "low mem") -3, "mem01" m 排後
→ "disk01(disk full), cpu01(high cpu), mem01(low mem)"
list_by_severity → "id(message)" 用括號。list_unacknowledged → "id:severity" 用冒號。
點解唔同?list_by_severity 已經按 severity filter 咗,再顯示 severity 冇乜意義,所以顯示 message。list_unacknowledged 冇 filter severity,所以要顯示出嚟。
list_by_severity(t, min_severity=3)
severity 3 嘅 alert → 3 >= 3 → 入
severity 4 嘅 alert → 4 >= 3 → 入
severity 2 嘅 alert → 2 >= 3 → 唔入
呢個同 Bank 嘅 top_spenders 唔同:Bank 係 top-N(排名取頭 N 個),NF 係 threshold filter(>= 某個值嘅全部入)。
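The difference between a threshold filter and a top-N cut is easiest to see on the same ranked list. This is a small sketch with made-up alert ids; `alerts` here maps id to severity only, which is a simplification of the real info dict.

```python
# Simplified illustrative data: alert_id -> severity.
alerts = {"cpu01": 3, "disk01": 4, "mem01": 3, "net01": 2}

# Shared ranking: severity desc, then id asc.
ranked = sorted(alerts.items(), key=lambda x: (-x[1], x[0]))

# Threshold filter: keep everything at or above the bar, however many that is.
at_least_3 = [aid for aid, sev in ranked if sev >= 3]

# Top-N: keep a fixed number of entries, whatever their values are.
top_2 = [aid for aid, sev in ranked[:2]]

print(at_least_3)  # ['disk01', 'cpu01', 'mem01']
print(top_2)       # ['disk01', 'cpu01']
```

Note that `mem01` ties with `cpu01` on severity yet is cut by top-N, while the threshold filter keeps both.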
set_escalation_deadline = set the time by which an alert must be acked
deadline_ms = absolute timestamp of the deadline
lazy = not a timer; checked only when a method is called
def set_escalation_deadline(self, timestamp, alert_id, deadline_ms): # 設升級期限
self._process_escalations(timestamp) # 開頭先跑升級(可能其他 alert 啱啱到期)
if alert_id not in self.alerts: # alert 唔存在
return False # 冇得設
self.escalations[alert_id] = deadline_ms # 記入 per-alert deadline dict
return True # 設定成功
def __init__(self):
self.alerts = {}
self.history = defaultdict(list)
self.escalations = {} ← 新增:per-alert deadline
self.alerts = { 所有警報(alert_id → info dict)
"disk01": {
"message": "disk full", 描述
"severity": 4, 嚴重度 1-5
"acknowledged": False, 未確認
"escalated": False, 未升過級
"created_at": 100, 建立時間
},
}
self.history = { 事件記錄(alert_id → list of strings)
"disk01": ["CREATED at 100"],
}
self.escalations = { per-alert 升級 deadline(L3 加)
"disk01": 5000, timestamp >= 5000 就升級
"cpu01": 8000, timestamp >= 8000 就升級
}
self.locks = defaultdict(asyncio.Lock) per-alert_id 鎖(L5 先加)
只有被 set_escalation_deadline 設定過嘅 alert 先有 entry。冇設嘅 alert → _process_escalations 會 skip。
FS L3:add_file_with_ttl(ts, path, size, ttl_ms)
→ expires_at = timestamp + ttl_ms(計相對時間)
→ 存入 file dict 入面
NF L3:set_escalation_deadline(ts, alert_id, deadline_ms)
→ deadline_ms 直接係絕對 timestamp
→ 存入獨立 self.escalations dict
共通點:到時間就觸發(FS 刪 file,NF 升 severity)
區別:FS 所有 TTL file 都有 expires_at field
NF 只有被設定嘅 alert 先有 deadline
t=100: create_alert(100, "disk01", "disk full", 3)
→ alerts["disk01"].severity = 3
t=200: set_escalation_deadline(200, "disk01", 5000)
→ escalations["disk01"] = 5000
t=4999: get_alert(4999, "disk01")
→ _process_escalations(4999)
→ 4999 < 5000 → 未到 → severity 仍然 3
t=5000: get_alert(5000, "disk01")
→ _process_escalations(5000)
→ 5000 >= 5000 → 到期!
→ severity 3 → 4,escalated = True
t=6000: get_alert(6000, "disk01")
→ _process_escalations(6000)
→ escalated == True → skip
→ severity 仍然 4(唔會再升)
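The timeline above can be replayed with a stand-alone version of the escalation check. `process_escalations` is a hypothetical free function that mirrors the helper's rules (deadline set, not acked, not yet escalated, `now >= deadline`); history logging is left out to keep the sketch short.

```python
def process_escalations(alerts, escalations, now):
    """Raise severity by one (capped at 5) for each overdue, unacked, not-yet-escalated alert."""
    for aid, alert in alerts.items():
        if aid not in escalations or alert["acknowledged"] or alert["escalated"]:
            continue
        if now >= escalations[aid]:  # the deadline instant itself counts as overdue
            alert["severity"] = min(5, alert["severity"] + 1)
            alert["escalated"] = True


alerts = {"disk01": {"severity": 3, "acknowledged": False, "escalated": False}}
escalations = {"disk01": 5000}

timeline = []
for now in (4999, 5000, 6000):
    process_escalations(alerts, escalations, now)
    timeline.append((now, alerts["disk01"]["severity"]))

print(timeline)  # [(4999, 3), (5000, 4), (6000, 4)]
```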
merge = fold source into dest
dest.severity = max of the two
dest.message = dest + " | " + source
source is deleted
get_history returns a list of event strings
def merge_alerts(self, timestamp, source_id, dest_id):  # merge source into dest
    self._process_escalations(timestamp)  # run lazy escalation first
    if source_id not in self.alerts:  # source missing
        return False
    if dest_id not in self.alerts:  # dest missing
        return False
    if source_id == dest_id:  # assumption: reject self-merge, otherwise the alert would delete itself
        return False
    src = self.alerts[source_id]  # source alert dict
    dst = self.alerts[dest_id]  # dest alert dict
    dst["severity"] = max(dst["severity"], src["severity"])  # keep the higher severity
    dst["message"] = dst["message"] + " | " + src["message"]  # concatenate, dest first
    del self.alerts[source_id]  # remove source
    self.history[dest_id].append("MERGED " + source_id + " at " + str(timestamp))  # log on dest
    return True  # merge succeeded
# 🟰 同 Workflow get_history 一樣(return list[str])
def get_history(self, timestamp, alert_id): # 攞狀態變更記錄
self._process_escalations(timestamp) # 開頭先跑升級
if alert_id not in self.alerts: # alert 唔存在
return None # 約定 return None
return list(self.history[alert_id]) # 返 copy(唔畀外面改 internal list)
將 source 嘅資料吸入 dest,然後殺死 source。做三件事:
合併前:
"disk01": {"severity": 4, "message": "disk full", ...}
"disk02": {"severity": 2, "message": "disk warn", ...}
call merge_alerts(t, "disk02", "disk01")
第一件:dest.severity = max(4, 2) = 4
第二件:dest.message = "disk full" + " | " + "disk warn"
第三件:del alerts["disk02"]
合併後:
"disk01": {"severity": 4, "message": "disk full | disk warn"}
disk02 已經冇咗
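The three merge steps can be run on plain dicts. This sketch uses a hypothetical free function `merge`; rejecting a self-merge is an assumption added here (the notes do not say what the spec wants), made because merging an alert into itself would otherwise delete it.

```python
def merge(alerts, source_id, dest_id):
    """Fold source into dest: max severity, joined message, source removed."""
    if source_id not in alerts or dest_id not in alerts:
        return False
    if source_id == dest_id:  # assumption: self-merge is rejected
        return False
    src = alerts.pop(source_id)  # step 3: source disappears
    dst = alerts[dest_id]
    dst["severity"] = max(dst["severity"], src["severity"])   # step 1
    dst["message"] = dst["message"] + " | " + src["message"]  # step 2, dest first
    return True


alerts = {
    "disk01": {"severity": 4, "message": "disk full"},
    "disk02": {"severity": 2, "message": "disk warn"},
}
merged = merge(alerts, "disk02", "disk01")
self_merge = merge(alerts, "disk01", "disk01")

print(merged)      # True
print(alerts)      # {'disk01': {'severity': 4, 'message': 'disk full | disk warn'}}
print(self_merge)  # False
```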
Same direction as FS copy_file (source → dest), but the fate of source differs: FS copies and keeps source, while NF absorbs source into dest and then deletes it.
由 L1 開始每次都有寫 history(create_alert / acknowledge_alert 都 append)。
Return list[str](唔係 string)。alert 唔存在就 return None。
get_history(t, "disk01") → [
"CREATED at 100",
"ESCALATED at 5000",
"ACKNOWLEDGED at 5500",
"MERGED disk02 at 6000",
]
batch = run several operations in one call
lock = mutual-exclusion lock
per-alert_id lock = one lock per alert_id
gather = run all tasks concurrently
async def batch_operations(self, timestamp, operations): # 批量操作(lock per key + gather)
self._process_escalations(timestamp) # 開頭先跑升級
async def execute_one(op): # 內嵌 async helper:做單個 op
op_type = op["type"] # 攞 op 類型
aid = op["alert_id"] # 攞 alert_id
async with self.locks[aid]: # 鎖呢個 alert_id(同一 id 嘅 op 排隊)
if op_type == "create": # create 類型
return self.create_alert( # 走返 L1 嘅 create_alert
timestamp, aid, op["message"], op["severity"]) # 將 op 入面條 alert 內容原封不動轉交畀 create_alert
elif op_type == "acknowledge": # acknowledge 類型
return self.acknowledge_alert( # 走返 L1 嘅 acknowledge_alert
timestamp, aid) # acknowledge 只需要對時同指出邊張 alert 單要確認
elif op_type == "get": # get 類型
return self.get_alert( # 走返 L1 嘅 get_alert
timestamp, aid) # get 只要時間同 alert id,就可以查返嗰張警報單
else: # 其他 type 唔 support
return None # 返 None
tasks = [] # 暫存所有 coroutine
for op in operations: # 逐個 op 包做 coroutine
tasks.append(execute_one(op)) # 入 list(未 await)
results = await asyncio.gather(*tasks) # 並發跑,等全部完,保留順序
return list(results) # 返一個同 operations 一樣長嘅 list
def __init__(self):
self.alerts = {}
self.history = defaultdict(list)
self.escalations = {}
self.locks = defaultdict(asyncio.Lock)
self.alerts = { 所有警報(alert_id → info dict)
"disk01": {
"message": "disk full", 描述
"severity": 4, 嚴重度
"acknowledged": False, 未確認
"escalated": False, 未升過級
"created_at": 100, 建立時間
},
}
self.history = { 事件記錄(alert_id → list of strings)
"disk01": ["CREATED at 100"],
}
self.escalations = { per-alert 升級 deadline
"disk01": 5000, 幾時要升級
}
self.locks = { per-alert_id 鎖(L5 加)
"disk01": <asyncio.Lock>, defaultdict 一 access 就自動造
"cpu01": <asyncio.Lock>,
}
每個 alert_id 一把獨立鎖。defaultdict 一 access 就自動造。兩個 op 鎖唔同 alert_id → 可以並行。兩個 op 鎖同一個 alert_id → 後嗰個會等。
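The queueing behaviour of per-key locks can be observed with a short experiment. `guarded` is a hypothetical coroutine that holds a key's lock across a 10 ms sleep; the `events` list records when each task enters and leaves its critical section.

```python
import asyncio
from collections import defaultdict


async def guarded(locks, key, name, events):
    """Hold the lock for `key` across a short sleep and record start/end."""
    async with locks[key]:
        events.append(("start", name))
        await asyncio.sleep(0.01)
        events.append(("end", name))


async def main():
    locks = defaultdict(asyncio.Lock)  # a lock is created on first access to a key
    events = []
    await asyncio.gather(
        guarded(locks, "disk01", "A", events),
        guarded(locks, "disk01", "B", events),  # same key as A → must wait
        guarded(locks, "cpu01", "C", events),   # different key → runs alongside A
    )
    return events


events = asyncio.run(main())
a_end = events.index(("end", "A"))
print(events.index(("start", "B")) > a_end)  # True: B waited for A
print(events.index(("start", "C")) < a_end)  # True: C overlapped with A
```

In the batch methods of these notes the guarded calls are synchronous, so the lock never actually blocks there; the sleep in this sketch exists only to make the waiting visible.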
FS L5 嘅 copy 涉及兩個 path → sorted lock 避死鎖。NF L5 每個 op 只涉及一個 alert_id → 唔使 sorted lock。
NF 嘅 merge 喺 L4 而唔係 L5。L5 嘅 batch 只做 create/ack/get(全部單 key),所以 NF L5 比 FS L5 簡單。
_process_escalations(timestamp)
間接 call(create_alert / acknowledge_alert / get_alert 第一行都 call)
batch_operations 本身都 call 一次(雙重保險)
無額外 helper(冇 sorted lock 因為只涉及單 key)
send = send a notification
ALL-SLEEP = every entry goes through sem + sleep; no fail-fast
check whether the alert exists only after the sleep
return "sent:id" / "failed:id"
async def send_notifications(self, timestamp, alert_ids, max_concurrent): # 並發 send(ALL-SLEEP pattern)
self._process_escalations(timestamp) # 開頭先跑升級
sem = asyncio.Semaphore(max_concurrent) # 開一個 N 位嘅 semaphore(同時最多 N 個)
tasks = [] # 暫存所有 coroutine task
for aid in alert_ids: # 逐個 alert_id 包做一個 task
task = self._do_one_send(timestamp, aid, sem) # 起 coroutine(未 await)
tasks.append(task) # 入 list
results = await asyncio.gather(*tasks) # 並發跑,等全部完,保留順序
return list(results) # 返 ["sent:disk01", "failed:zz", ...] 同 input 一樣長
async def _do_one_send(self, timestamp, alert_id, sem): # 做單一 send(async helper)
# ALL-SLEEP:唔好喺 sem 之前 check 任何嘢!全部入 sem
async with sem: # 攞 semaphore(限速)
await asyncio.sleep(0.01) # 模擬發送延遲(10ms)
# sleep 完先 check:呢個時間差係 ALL-SLEEP 嘅重點
if alert_id not in self.alerts: # sleep 完先 check alert 仲喺唔喺度
return "failed:" + alert_id # 唔存在 → "failed:disk01"
return "sent:" + alert_id # 存在 → "sent:disk01"
def __init__(self):
self.alerts = {}
self.history = defaultdict(list)
self.escalations = {}
self.locks = defaultdict(asyncio.Lock)
NF L6 係 ALL-SLEEP pattern,同 FS/Bank 嘅 fail-fast pattern 完全唔同。呢個係 Notification 最大嘅 L6 差異。
Fail-fast(FS, Bank, Hashring, ChatRoute 用呢個):
if source not in self.files: ← sem 之前 check
return False ← 即刻 fail,唔 acquire sem
async with sem: ← 過咗 check 先攞 sem
await asyncio.sleep(0.01)
return True
ALL-SLEEP(Notification 用呢個):
async with sem: ← 直接攞 sem(唔 check 先)
await asyncio.sleep(0.01) ← 全部都 sleep
if aid not in self.alerts: ← sleep 完先 check
return "failed:" + aid
return "sent:" + aid
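The difference is large. With 20 invalid alert_ids and max_concurrent=2, fail-fast returns False for all 20 immediately without touching the semaphore; ALL-SLEEP runs them two at a time, so 10 rounds of 10 ms sleep (about 100 ms) pass before the last result comes back.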
點解 NF 用 ALL-SLEEP?模擬真實世界:send notification 係向外部 service 發 request。你唔知個 alert 喺 send 途中會唔會被刪走。所以要 acquire sem → 排隊 → 真正 send(sleep)→ 先 check 結果。
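The two orderings can be compared side by side by counting which requests actually reach the sleep. This is a sketch under assumptions: `fail_fast` and `all_sleep` are hypothetical free functions, `known` stands in for `self.alerts`, and the `sleeps` list is instrumentation added only for the comparison.

```python
import asyncio


async def fail_fast(known, ids, max_concurrent, sleeps):
    sem = asyncio.Semaphore(max_concurrent)

    async def one(aid):
        if aid not in known:      # validate BEFORE taking the semaphore
            return False
        async with sem:
            sleeps.append(aid)    # only valid ids get here
            await asyncio.sleep(0.001)
        return True

    return await asyncio.gather(*(one(a) for a in ids))


async def all_sleep(known, ids, max_concurrent, sleeps):
    sem = asyncio.Semaphore(max_concurrent)

    async def one(aid):
        async with sem:           # every id queues and sleeps
            sleeps.append(aid)
            await asyncio.sleep(0.001)
        # validate AFTER the simulated call
        return ("sent:" if aid in known else "failed:") + aid

    return await asyncio.gather(*(one(a) for a in ids))


async def main():
    known = {"disk01"}
    ids = ["disk01", "zz", "yy"]
    ff_sleeps, as_sleeps = [], []
    ff = await fail_fast(known, ids, 2, ff_sleeps)
    al = await all_sleep(known, ids, 2, as_sleeps)
    return ff, ff_sleeps, al, as_sleeps


ff, ff_sleeps, al, as_sleeps = asyncio.run(main())
print(ff, ff_sleeps)       # [True, False, False] ['disk01']
print(al, len(as_sleeps))  # ['sent:disk01', 'failed:zz', 'failed:yy'] 3
```

Which ordering a problem wants is decided by the spec wording: "without simulating" or "skip missing" points to fail-fast, while "every request must attempt" points to ALL-SLEEP.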
FS L6 returns list[bool]
  [True, False, True]
NF L6 returns list[str]
  ["sent:disk01", "failed:zz", "sent:cpu01"]
  "sent:" prefix = send succeeded
  "failed:" prefix = send failed (alert does not exist)
  the alert_id follows the colon
_process_escalations(timestamp)
  Called once at the top of send_notifications
_do_one_send(timestamp, alert_id, sem)
  This level's own async helper
  Wraps sem + sleep + check (in ALL-SLEEP order)
── L1 CRUD ──
🟰 create_session: same as Bank create_account (check exists → True/False)
🟰 get_session: similar to Bank get_balance (return user_id or "")
🟰 end_session: same as FS delete_file (del + True/False)
── L2 Sort ──
🟰 list_sessions: same as Workflow list_workflows (sort_by + format)
🟰 count_active: simply len after purge
── L3 TTL ──
🟰 create_session_with_ttl: same as FS add_file_with_ttl
⚠️ refresh_session: unique (sets expires_at = ts + new_ttl)
── L4 Max Sessions ──
⚠️ set_max_sessions: similar to Hashring set_capacity, but per user
⚠️ create_session, L4 version: at the limit, evict the oldest session (called LRU here, per user)
── L5 Batch ──
🟰 batch_operations: same as Bank L5 (lock per session_id)
── L6 Sync ──
🟰 sync_sessions: fail-fast + sleep (same as Hashring L6)
Imagine writing a simplified session manager mock. Each session has a session_id ("s1"), a user_id ("alice"), created_at and expires_at. The class has to model create/delete/lookup, sorting, TTL expiry, max sessions per user, and async batch.
Picture a session store:
┌───────────────────────────────────────────────┐
│ s1 user=alice created=100 expires=None │
│ s2 user=bob created=200 expires=None │
│ s3 user=alice created=300 expires=5300 (TTL) │
└───────────────────────────────────────────────┘
Each session has:
  session_id = the session's key ("s1")
  user_id = which user ("alice")
  created_at = when it was opened (int timestamp)
  expires_at = when it expires (None = never expires)
Rules:
  1. session_id must be unique (check before create)
  2. An expired TTL session counts as nonexistent (lazy purge)
  3. Max sessions per user → at the limit, evict the oldest one (smallest created_at; called LRU here)
get_session(t, "s1") → "alice"
get_session(t, "s99") → "" (does not exist)
list_sessions(t, "id") → "s1(alice), s2(bob), s3(alice)" (alphabetical)
list_sessions(t, "created") → "s3(alice), s2(bob), s1(alice)" (newest first)
count_active(t) → 3
Later levels add more:
  L2 adds sort/filter (list_sessions, count_active)
  L3 adds TTL (create_session_with_ttl, refresh_session, lazy purge)
  L4 adds set_max_sessions (per-user limit, oldest-first eviction)
  L5 adds async batch_operations (per-session_id lock)
  L6 adds sync_sessions (rate-limited, semaphore)
import asyncio
from collections import defaultdict
class SessionManager:
    def __init__(self):
        self.sessions = {}  # L1: all sessions (session_id → info dict)
        self.max_per_user = None  # added in L4: max sessions per user (None = no limit)
        self.locks = defaultdict(asyncio.Lock)  # added in L5: per-session_id async lock
self.sessions = {
    "s1": {"user_id": "alice", "created_at": 100, "expires_at": None},
    "s2": {"user_id": "bob", "created_at": 200, "expires_at": None},
}
First-level key = session_id ("s1")
Second level is a dict holding that session's info
L1: user_id, created_at, expires_at are the basics (expires_at defaults to None)
L2: (no new field, only reads existing fields)
L3: expires_at gets set to a number; create_session_with_ttl computes timestamp + ttl_ms
L4: self.max_per_user, initialised to None in __init__
L5: self.locks, a defaultdict(asyncio.Lock) added in __init__
L6: (no new field, the semaphore is created inside the method)
# Helper: _purge_expired: lazy TTL cleanup of expired sessions (called at the top of every public method)
def _purge_expired(self, timestamp):  # not a timer task; runs lazily on access
    expired = []  # collect ids first; a dict must not change size while it is being iterated
    for sid, info in self.sessions.items():
        exp = info["expires_at"]  # may be None
        if exp is None:  # None = never expires
            continue
        if timestamp >= exp:  # now >= expires_at → expired
            expired.append(sid)
    for sid in expired:
        del self.sessions[sid]  # actually remove it
_purge_expired(timestamp)
  Walks self.sessions once
  Any session whose expires_at is not None and timestamp >= expires_at
  is deleted from self.sessions
  Called on the first line of every public method (lazy mode)
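The boundary is the part that is easy to get wrong, so here is a small standalone sketch of the same rule. `purge_expired` is a hypothetical free-function version of the helper above that also returns what it removed, purely for illustration.

```python
def purge_expired(sessions, timestamp):
    """Delete every session whose expires_at is set and timestamp >= expires_at."""
    expired = [sid for sid, info in sessions.items()
               if info["expires_at"] is not None and timestamp >= info["expires_at"]]
    for sid in expired:  # delete after the scan, never during it
        del sessions[sid]
    return expired

sessions = {
    "s1": {"user_id": "alice", "created_at": 100, "expires_at": None},
    "s3": {"user_id": "alice", "created_at": 300, "expires_at": 5300},
}
before = purge_expired(sessions, 5299)   # one tick early: nothing expires
at_edge = purge_expired(sessions, 5300)  # timestamp == expires_at counts as expired
print(before, at_edge, list(sessions))
```

The session with expires_at=None survives every purge, and the TTL session disappears exactly at timestamp 5300, not one tick later.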
create = open a session · get = look up the user_id · end = close a session · True/False = success/failure
def create_session(self, timestamp, session_id, user_id):  # open a new session
    self._purge_expired(timestamp)  # purge expired sessions first (standard pattern)
    if session_id in self.sessions:  # duplicate session_id → reject
        return False
    self.sessions[session_id] = {  # add the new session
        "user_id": user_id,  # which user
        "created_at": timestamp,  # when it was opened
        "expires_at": None,  # no TTL (only the L3 method sets a number)
    }
    return True
# 🟰 Similar to Bank get_balance (return user_id or "")
def get_session(self, timestamp, session_id):  # look up which user owns the session
    self._purge_expired(timestamp)  # an expired session must count as nonexistent
    if session_id not in self.sessions:
        return ""  # empty string, not None, because the spec wants a string
    return self.sessions[session_id]["user_id"]
# 🟰 Same as FS delete_file (del + True/False)
def end_session(self, timestamp, session_id):  # close a session
    self._purge_expired(timestamp)
    if session_id not in self.sessions:  # missing, or already purged
        return False
    del self.sessions[session_id]
    return True
def __init__(self):
    self.sessions = {}
self.sessions = {                        all sessions (session_id → info dict)
    "s1": {
        "user_id": "alice",              which user
        "created_at": 100,               when it was opened
        "expires_at": None,              never expires (only L3 sets a number)
    },
}
self.max_per_user = None                 max sessions per user (added in L4)
self.locks = defaultdict(asyncio.Lock)   per-session_id lock (added in L5)
expires_at always defaults to None (only create_session_with_ttl in L3 sets a number). L1 does not use TTL, but the field is there from the start so that L3 plugs straight in.
_purge_expired(timestamp)
  Called on the first line of every L1 method
  L1 itself never produces an expired session (create_session always sets expires_at=None)
  Build the habit now, so TTL works as soon as L3 adds it
list_sessions = list all sessions · sort_by = "id" (alphabetical asc) or "created" (newest first, tie by id) · count_active = number of active sessions
def list_sessions(self, timestamp, sort_by):  # list all sessions
    self._purge_expired(timestamp)
    items = []  # (session_id, user_id, created_at) tuples
    for sid, info in self.sessions.items():
        items.append((sid, info["user_id"], info["created_at"]))
    if sort_by == "created":  # newest first
        items.sort(key=lambda x: (-x[2], x[0]))  # created_at desc, tie by id asc
    else:  # default: id mode
        items.sort(key=lambda x: x[0])  # id asc (alphabetical)
    parts = []
    for sid, uid, _ in items:  # format each entry as "session_id(user_id)"
        parts.append(sid + "(" + uid + ")")
    return ", ".join(parts)
# 🟰 As simple as ChatRoute get_server_count (len after purge)
def count_active(self, timestamp):  # count active sessions
    self._purge_expired(timestamp)  # expired sessions do not count
    return len(self.sessions)  # whatever is left after the purge
def __init__(self):
    self.sessions = {}
Same as L1; no new field.
self.sessions = {                        all sessions (session_id → info dict)
    "s1": {
        "user_id": "alice",              which user
        "created_at": 100,               when it was opened
        "expires_at": None,              never expires
    },
    "s2": {
        "user_id": "bob",
        "created_at": 200,
        "expires_at": None,
    },
    "s3": {
        "user_id": "alice",
        "created_at": 300,
        "expires_at": None,
    },
}
self.max_per_user = None                 max sessions per user (added in L4)
self.locks = defaultdict(asyncio.Lock)   per-session_id lock (added in L5)
items.sort(key=lambda x: (-x[2], x[0]))
What does lambda x: (-x[2], x[0]) do? x is a tuple such as ("s1", "alice", 100). x[2] = 100 (created_at) and -x[2] = -100 (negating turns the largest value into the smallest, so it sorts first). x[0] = "s1" (session_id, used as the tie-break).
So it returns a new tuple (-created_at, session_id). sort puts smaller keys first by default → smallest -created_at = largest created_at → newest first. Equal created_at → tie-break by id in ascending alphabetical order.
Sort key for each tuple:
  ("s1", "alice", 100) → (-100, "s1")
  ("s2", "bob", 200) → (-200, "s2")
  ("s3", "alice", 300) → (-300, "s3")
After sorting:
  ("s3", "alice", 300)   -300 ← smallest, newest
  ("s2", "bob", 200)     -200
  ("s1", "alice", 100)   -100 ← oldest
→ "s3(alice), s2(bob), s1(alice)"
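The walkthrough above has no ties, so the tie-break never fires. The sketch below adds a hypothetical fourth session "s0" with the same created_at as "s3" to exercise it, and also shows the two-pass stable-sort alternative for cases where the primary key cannot be negated (strings, for example).

```python
items = [
    ("s1", "alice", 100),
    ("s2", "bob", 200),
    ("s3", "alice", 300),
    ("s0", "carol", 300),  # hypothetical extra row that ties with s3 on created_at
]

# One pass: negate the numeric key for descending, keep id ascending as tie-break
by_created = sorted(items, key=lambda x: (-x[2], x[0]))

# Two passes: sort by the tie-break first, then by the primary key with reverse=True.
# sorted() is stable even with reverse=True, so equal created_at keep the id order.
two_pass = sorted(sorted(items, key=lambda x: x[0]), key=lambda x: x[2], reverse=True)

formatted = ", ".join(sid + "(" + uid + ")" for sid, uid, _ in by_created)
print(formatted)
```

Both approaches give the same order; the negation trick is shorter, while the two-pass form works for any comparable key.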
_purge_expired(timestamp)
  Called at the top of both list_sessions and count_active
  Expired sessions must not appear in the list or the count
TTL = time to live · ttl_ms = how long until expiry (milliseconds) · expires_at = absolute expiry timestamp · refresh = extend the lifetime
def create_session_with_ttl(self, timestamp, session_id, user_id, ttl_ms):  # open a session with a lifetime
    self._purge_expired(timestamp)  # a session_id that has just expired can be reused
    if session_id in self.sessions:  # same id still active → reject
        return False  # never overwrite silently
    self.sessions[session_id] = {
        "user_id": user_id,
        "created_at": timestamp,
        "expires_at": timestamp + ttl_ms,  # absolute expiry = now + lifetime
    }
    return True
# ⚠️ No Bank/FS counterpart; unique (sets expires_at = ts + new_ttl)
def refresh_session(self, timestamp, session_id, ttl_ms):  # extend the lifetime
    self._purge_expired(timestamp)
    if session_id not in self.sessions:  # missing, or already purged
        return False
    self.sessions[session_id]["expires_at"] = timestamp + ttl_ms  # the TTL restarts from now
    return True
def __init__(self):
    self.sessions = {}
Still no new instance variable; the TTL information lives inside each session dict.
self.sessions = {                        all sessions (session_id → info dict)
    "s1": {
        "user_id": "alice",              which user
        "created_at": 100,               when it was opened
        "expires_at": None,              never expires
    },
    "s3": {
        "user_id": "alice",
        "created_at": 300,
        "expires_at": 5300,              expires at t=5300 ms (set by create_session_with_ttl)
    },
}
self.max_per_user = None                 max sessions per user (added in L4)
self.locks = defaultdict(asyncio.Lock)   per-session_id lock (added in L5)
expires_at takes two kinds of value:
  None → never expires (set by create_session)
  int (ms) → expired once timestamp >= this value (set by create_session_with_ttl)
_purge_expired(timestamp)
  L3 is where it actually does work: any session with a non-None expires_at that timestamp has reached is deleted
  So every query method naturally stops seeing expired sessions
create_session_with_ttl: session does not exist → create it with expires_at = timestamp + ttl_ms. Session already exists → return False.
refresh_session: session exists → set expires_at = timestamp + ttl_ms (counted from now). Session does not exist → return False. Note the order: purge first, refresh second. A session that has just expired is purged, so refreshing it returns False.
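The purge-then-refresh ordering can be checked with a trimmed-down class. `MiniSessions` is a hypothetical stand-in that keeps only the L3 pieces described above; the timestamps are made up for the example.

```python
class MiniSessions:
    """Hypothetical cut-down SessionManager with only the L3 behaviour."""

    def __init__(self):
        self.sessions = {}

    def _purge_expired(self, timestamp):
        dead = [sid for sid, info in self.sessions.items()
                if info["expires_at"] is not None and timestamp >= info["expires_at"]]
        for sid in dead:
            del self.sessions[sid]

    def create_session_with_ttl(self, timestamp, session_id, user_id, ttl_ms):
        self._purge_expired(timestamp)
        if session_id in self.sessions:
            return False
        self.sessions[session_id] = {"user_id": user_id, "created_at": timestamp,
                                     "expires_at": timestamp + ttl_ms}
        return True

    def refresh_session(self, timestamp, session_id, ttl_ms):
        self._purge_expired(timestamp)  # purge first ...
        if session_id not in self.sessions:
            return False  # ... so an expired session cannot be revived
        self.sessions[session_id]["expires_at"] = timestamp + ttl_ms
        return True

m = MiniSessions()
m.create_session_with_ttl(300, "s3", "alice", 5000)  # expires_at = 5300
ok_early = m.refresh_session(5299, "s3", 1000)   # still alive; expires_at becomes 6299
ok_late = m.refresh_session(6299, "s3", 1000)    # timestamp == expires_at: already purged
reused = m.create_session_with_ttl(6299, "s3", "bob", 10)  # the id is free again
```

The late refresh fails, but the same session_id can immediately be created again because the purge has already removed the old entry.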
set_max_sessions = set the limit · max_per_user = max sessions per user · LRU (as used here) = at the limit, evict the oldest session (smallest created_at)
def set_max_sessions(self, timestamp, max_per_user):  # set the per-user session limit
    self._purge_expired(timestamp)
    self.max_per_user = max_per_user  # checked on every later create
# ── create_session changes from L4 on ──
# ⚠️ L4 version: at the limit, create evicts the oldest session
def create_session(self, timestamp, session_id, user_id):  # open a new session
    self._purge_expired(timestamp)
    if session_id in self.sessions:  # duplicate → reject
        return False
    # ── new in L4: check max_per_user ──
    if self.max_per_user is not None:  # only when a limit is set (None = no limit)
        user_sessions = []  # this user's active sessions
        for sid, info in self.sessions.items():
            if info["user_id"] == user_id:
                user_sessions.append((sid, info["created_at"]))
        if len(user_sessions) >= self.max_per_user:  # at or over the limit (assumes max_per_user >= 1)
            user_sessions.sort(key=lambda x: (x[1], x[0]))  # created_at asc, tie by id asc → oldest first
            oldest_sid = user_sessions[0][0]
            del self.sessions[oldest_sid]  # evict it
    # ── the rest is the same as L1 ──
    self.sessions[session_id] = {
        "user_id": user_id,
        "created_at": timestamp,
        "expires_at": None,  # no TTL
    }
    return True
def __init__(self):
    self.sessions = {}
    self.max_per_user = None
self.sessions = {                        all sessions (session_id → info dict)
    "s1": {
        "user_id": "alice",              which user
        "created_at": 100,               when it was opened
        "expires_at": None,              never expires
    },
    "s3": {
        "user_id": "alice",
        "created_at": 300,
        "expires_at": None,
    },
}
self.max_per_user = 2                    at most 2 sessions per user (added in L4)
self.locks = defaultdict(asyncio.Lock)   per-session_id lock (added in L5)
At this point alice has 2 sessions (s1 and s3). Creating one more session for alice:
  user_sessions = [("s1", 100), ("s3", 300)]
  oldest after sorting = ("s1", 100)
  del self.sessions["s1"] ← evict the oldest
  only then add the new one
Original L1 create_session: purge → check duplicate → create session → return True.
L4 create_session: purge → check duplicate → check max_per_user → if at the limit → find the oldest → evict → create session → return True. If under the limit → create the session directly → return True.
Eviction logic:
  1. Walk self.sessions once to collect the same user's sessions
  2. Sort by created_at asc, tie by id asc
  3. del the first one (the oldest)
  4. Only then create the new one
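The eviction steps can be isolated into a small function. This is a sketch under assumptions: `evict_oldest_if_full` is a hypothetical helper name, it uses min() instead of a full sort (same result, since only the first element is needed), and it adds a guard for an empty list that the method above does not have.

```python
def evict_oldest_if_full(sessions, user_id, max_per_user):
    """Evict the user's oldest session when at the limit; return its id or None."""
    if max_per_user is None:  # no limit configured
        return None
    owned = [(info["created_at"], sid) for sid, info in sessions.items()
             if info["user_id"] == user_id]
    if not owned or len(owned) < max_per_user:  # nothing to evict, or still room
        return None
    _, oldest_sid = min(owned)  # smallest created_at, then smallest id
    del sessions[oldest_sid]
    return oldest_sid

sessions = {
    "s1": {"user_id": "alice", "created_at": 100, "expires_at": None},
    "s2": {"user_id": "bob", "created_at": 200, "expires_at": None},
    "s3": {"user_id": "alice", "created_at": 300, "expires_at": None},
}
evicted_bob = evict_oldest_if_full(sessions, "bob", 2)      # bob has 1 of 2: no eviction
evicted_alice = evict_oldest_if_full(sessions, "alice", 2)  # alice has 2 of 2: s1 goes
print(evicted_bob, evicted_alice, sorted(sessions))
```

Because eviction is keyed on created_at only, reading or refreshing a session never protects it from eviction.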
batch = several ops in one call · lock = mutual exclusion · per-session_id lock = one lock per session_id; defaultdict creates it on first access
async def batch_operations(self, timestamp, ops):  # batch of ops (lock per key; ops run one at a time, in input order)
    results = []  # one result per op
    for op in ops:
        op_type = op["type"]  # "create" / "get" / "end"
        if op_type == "create":
            sid = op["session_id"]
            uid = op["user_id"]
            async with self.locks[sid]:  # lock this session_id
                ok = self.create_session(timestamp, sid, uid)  # reuse the L1/L4 create_session
            results.append(ok)  # True/False
        elif op_type == "get":
            sid = op["session_id"]
            async with self.locks[sid]:
                val = self.get_session(timestamp, sid)  # reuse the L1 get_session
            results.append(val)  # user_id or ""
        elif op_type == "end":
            sid = op["session_id"]
            async with self.locks[sid]:
                ok = self.end_session(timestamp, sid)  # reuse the L1 end_session
            results.append(ok)  # True/False
        else:  # unsupported type
            results.append(False)
    return results  # same length as the input
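def __init__(self):
    self.sessions = {}
    self.max_per_user = None
    self.locks = defaultdict(asyncio.Lock)
self.sessions = {                        all sessions (session_id → info dict)
    "s1": {
        "user_id": "alice",              which user
        "created_at": 100,               when it was opened
        "expires_at": None,              never expires
    },
}
self.max_per_user = 2                    max sessions per user
self.locks = {                           per-session_id locks (added in L5)
    "s1": <asyncio.Lock>,                created by defaultdict on first access
    "s2": <asyncio.Lock>,
}
defaultdict creates a lock on first access, so every session_id gets its own lock. Two ops locking different session_ids → can proceed concurrently. Two ops locking the same session_id → the later one waits. (The loop above awaits ops one at a time, so within a single call the locks only matter when several batch_operations calls run concurrently.)
_purge_expired(timestamp)
  Called indirectly (create_session / get_session / end_session each call it on their first line)
  No extra helper (the locking logic is written inline in batch_operations)
FS batch copy touches two paths → it must lock both in sorted order (to prevent deadlock). Each Session batch op touches a single session_id → one lock is enough. So Session batch is simpler: no sorted-lock logic.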
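A small experiment shows what the per-key lock buys once ops really do run concurrently. The names `locks`, `log` and `touch` are hypothetical; the sleep stands in for any await that happens while the lock is held.

```python
import asyncio
from collections import defaultdict

locks = defaultdict(asyncio.Lock)  # one lock per key, created on first access
log = []

async def touch(key, label):
    async with locks[key]:
        log.append("start " + label)
        await asyncio.sleep(0.01)  # hold the lock across an await
        log.append("end " + label)

async def main():
    # a and b share key "s1", so b must wait for a; c uses "s2" and is independent
    await asyncio.gather(touch("s1", "a"), touch("s1", "b"), touch("s2", "c"))

asyncio.run(main())
print(log)
```

Op b cannot start before a has finished, while c starts straight away because it holds a different lock. With only one key per op there is no lock ordering to worry about, which is why sorted() is needed only for the two-key variants.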
sync = synchronise/transfer · semaphore = limits how many tasks run at once · fail-fast = fail as soon as a precondition is not met, without waiting for the semaphore
async def sync_sessions(self, timestamp, transfers, max_concurrent):  # concurrent sync (fail-fast)
    self._purge_expired(timestamp)
    sem = asyncio.Semaphore(max_concurrent)  # at most max_concurrent in flight
    tasks = []  # one coroutine per transfer
    for transfer in transfers:
        task = self._do_one_sync(timestamp, transfer, sem)  # creates the coroutine; nothing runs yet
        tasks.append(task)
    results = await asyncio.gather(*tasks)  # run concurrently, wait for all, keep input order
    return list(results)  # list[bool], same length as transfers
async def _do_one_sync(self, timestamp, transfer, sem):  # handle a single transfer (async helper)
    sid = transfer["session_id"]  # the session to sync
    # fail-fast: checked before acquiring the semaphore (does not hold up other tasks)
    if sid not in self.sessions:
        return False  # immediate False, no semaphore acquired
    async with sem:  # only after passing the check (rate limit)
        await asyncio.sleep(0.01)  # simulated sync latency (10 ms)
    return True
def __init__(self):
    self.sessions = {}
    self.max_per_user = None
    self.locks = defaultdict(asyncio.Lock)
Same fields as L5; the semaphore is created inside the method (per call).
self.sessions = {                        all sessions (session_id → info dict)
    "s1": {
        "user_id": "alice",              which user
        "created_at": 100,               when it was opened
        "expires_at": None,              never expires
    },
}
self.max_per_user = 2                    max sessions per user
self.locks = defaultdict(asyncio.Lock)   per-session_id lock
The semaphore is not stored on self; every sync_sessions call creates a fresh one with N slots.
_purge_expired(timestamp)
  Called once at the top of sync_sessions
_do_one_sync(timestamp, transfer, sem)
  This level's own async helper; wraps fail-fast + semaphore + sleep
If the session does not exist, return False immediately. No semaphore is acquired → valid tasks are not held up.
  Order: check → acquire sem → sleep → return True
  Not: acquire sem → check → sleep → return True
  The latter wastes a semaphore slot on a task that is bound to fail
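Imagine writing an event scheduler mock. Each event has an id, an execute_at (when it fires) and a status. When its time comes it becomes EXECUTED automatically. A recurring event automatically schedules its next run.
Picture a scheduler: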
┌─────────────────────────────────────────────────┐
│ "evt_1" execute_at=1000 status=SCHEDULED │
│ "evt_2" execute_at=2000 status=SCHEDULED │
│ "evt_3" execute_at=500 status=EXECUTED │
│ "rec_1" execute_at=3000 interval=1000 SCHEDULED│
└─────────────────────────────────────────────────┘
Each event has:
  event_id = the event's key ("evt_1")
  execute_at = when it fires (int ms timestamp)
  status = "SCHEDULED" / "EXECUTED" / "DISPATCHED"
  interval = None (one-shot) or int ms (recurring)
Rules:
  1. event_id must be unique (check before schedule)
  2. Due events fire lazily (every public method calls _process_pending first)
  3. A one-shot event fires once, becomes EXECUTED, and never fires again
  4. A recurring event, once fired, gets execute_at += interval and its status reset to SCHEDULED
# Example: queries against the scheduler above
get_event(t=600, "evt_3") → {"execute_at":500, "status":"EXECUTED", "interval":None}
get_event(t=600, "evt_1") → {"execute_at":1000, "status":"SCHEDULED", "interval":None}
get_event(t=600, "zzz") → None (does not exist)
list_events(t, "time") → "evt_3(EXECUTED), evt_1(SCHEDULED), ..." (execute_at asc)
list_pending(t=600) → "evt_1(SCHEDULED), evt_2(SCHEDULED), rec_1(SCHEDULED)"
# Later levels add more:
# L2 adds sort/filter (list_events, list_pending)
# L3 adds auto-trigger (_process_pending, lazy SCHEDULED→EXECUTED)
# L4 adds recurring (schedule_recurring, get_next_execution)
# L5 adds async batch_operations (per-event_id lock)
# L6 adds dispatch_events (rate-limited, semaphore + external_call)
import asyncio
from collections import defaultdict
class EventScheduler:
    def __init__(self):
        self.events = {}  # L1: all events (event_id → info dict)
        self.locks = defaultdict(asyncio.Lock)  # added in L5: per-event_id async lock
self.events = {
    "evt_1": {"execute_at": 1000, "status": "SCHEDULED", "interval": None},
    "rec_1": {"execute_at": 3000, "status": "SCHEDULED", "interval": 1000},
}
# First-level key = event_id ("evt_1")
# Second level is a dict holding that event's info
L1: execute_at, status, interval   # the basics (interval defaults to None)
L2: (no new field, only reads status + execute_at)
L3: (no new field, _process_pending changes status)
L4: interval goes from None to int ms (recurring event)
L5: self.locks   # a defaultdict(asyncio.Lock) added in __init__
L6: (no new field, the semaphore is created inside the method)
# Helper: _process_pending: lazily fires due events (called at the top of every public method)
def _process_pending(self, timestamp):  # not a timer task; runs lazily on access
    for eid, info in self.events.items():
        if info["status"] != "SCHEDULED":  # already EXECUTED or DISPATCHED
            continue  # never process twice
        if timestamp < info["execute_at"]:  # not due yet
            continue
        # here: SCHEDULED and due
        if info["interval"] is None:  # one-shot event
            info["status"] = "EXECUTED"  # done
        else:  # recurring event
            info["status"] = "EXECUTED"  # mark this round as executed
            info["execute_at"] += info["interval"]  # next fire time = previous execute_at + interval (not now + interval)
            info["status"] = "SCHEDULED"  # reset to SCHEDULED for the next lazy pass
_process_pending(timestamp)
  Walks self.events once
  For every event with status == "SCHEDULED" and timestamp >= execute_at
    one-shot → status becomes "EXECUTED"
    recurring → execute_at += interval, status reset to "SCHEDULED"
  Called on the first line of every public method (lazy mode)
Example: rec_1 execute_at=3000, interval=1000
  _process_pending called at timestamp=3500:
    3500 >= 3000 → due
    interval != None → recurring
    execute_at = 3000 + 1000 = 4000 (next run scheduled)
    status reset to SCHEDULED
  Called again at timestamp=4500:
    4500 >= 4000 → due again
    execute_at = 4000 + 1000 = 5000
    status reset to SCHEDULED
  So a recurring event is always SCHEDULED
  Every time it comes due it is pushed on to the next run
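The walkthrough can be replayed with a free-function version of the helper. `process_pending` is a hypothetical standalone copy of the logic above, and the third case (a large jump in time) is an extra observation about that logic, not something the notes' spec states.

```python
def process_pending(events, timestamp):
    for info in events.values():  # only values are mutated, so iterating is safe
        if info["status"] != "SCHEDULED" or timestamp < info["execute_at"]:
            continue
        if info["interval"] is None:
            info["status"] = "EXECUTED"  # one-shot: terminal until dispatched
        else:
            info["execute_at"] += info["interval"]  # recurring: stays SCHEDULED

events = {
    "evt_1": {"execute_at": 1000, "status": "SCHEDULED", "interval": None},
    "rec_1": {"execute_at": 3000, "status": "SCHEDULED", "interval": 1000},
}
process_pending(events, 3500)
after_3500 = events["rec_1"]["execute_at"]
process_pending(events, 4500)
after_4500 = events["rec_1"]["execute_at"]

# A single call advances a recurring event by exactly one interval,
# even if the clock has jumped far ahead.
jump = {"rec_2": {"execute_at": 3000, "status": "SCHEDULED", "interval": 1000}}
process_pending(jump, 9000)
after_jump = jump["rec_2"]["execute_at"]
print(after_3500, after_4500, after_jump)
```

After the jump to 9000 the next execute_at is still in the past, so the event fires again on the following call. Whether a real spec expects a catch-up loop here is an open question worth checking against the test cases.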
schedule = add a new event · cancel = cancel it · get = query its state · True/False = success or failure · None = does not exist
def schedule_event(self, timestamp, event_id, execute_at):  # schedule a new one-shot event
    self._process_pending(timestamp)  # fire due events first (standard pattern)
    if event_id in self.events:  # duplicate id → reject
        return False
    self.events[event_id] = {
        "execute_at": execute_at,  # absolute fire time
        "status": "SCHEDULED",  # the initial state is always SCHEDULED
        "interval": None,  # one-shot (only the L4 method sets a number)
    }
    return True
def cancel_event(self, timestamp, event_id):  # cancel an event
    self._process_pending(timestamp)
    if event_id not in self.events:
        return False  # nothing to cancel
    del self.events[event_id]  # really deleted, not a status change
    return True
def get_event(self, timestamp, event_id):  # query an event's state
    self._process_pending(timestamp)  # the status may have just changed
    if event_id not in self.events:
        return None  # None, not -1: the spec wants a dict or None
    return dict(self.events[event_id])  # shallow copy so callers cannot mutate internal state
def __init__(self):
    self.events = {}
self.events = {
    "evt_1": {
        "execute_at": 1000,
        "status": "SCHEDULED",
        "interval": None,
    },
}
# interval always defaults to None (only schedule_recurring in L4 sets a number)
# L1 does not use recurring, but the field is there from the start so that L4 plugs straight in
_process_pending(timestamp)
  Called on the first line of every L1 method
  Events scheduled in L1 may already be due and need firing
  Build the habit now, so the L3 lazy trigger works
cancel_event deletes directly
  It does not set the status to "CANCELLED"
  because the spec has no CANCELLED status
  The three valid statuses: SCHEDULED / EXECUTED / DISPATCHED
  cancel = the event disappears from the events dict
list_events = list all events · sort_by = "id" or "time" · list_pending = list only SCHEDULED events
def list_events(self, timestamp, sort_by):  # list all events, sorted by id or time
    self._process_pending(timestamp)
    items = []  # (event_id, info) tuples
    for eid, info in self.events.items():
        items.append((eid, info))
    if sort_by == "time":
        items.sort(key=lambda x: (x[1]["execute_at"], x[0]))  # execute_at asc, tie by id asc
    else:  # default: id mode
        items.sort(key=lambda x: x[0])  # id asc
    parts = []
    for eid, info in items:  # format each entry as "id(status)"
        parts.append(eid + "(" + info["status"] + ")")
    return ", ".join(parts)
def list_pending(self, timestamp):  # list only SCHEDULED events
    self._process_pending(timestamp)  # due one-shot events have already become EXECUTED
    items = []
    for eid, info in self.events.items():
        if info["status"] == "SCHEDULED":
            items.append((eid, info))
    items.sort(key=lambda x: (x[1]["execute_at"], x[0]))  # execute_at asc, tie by id asc
    parts = []
    for eid, info in items:  # format each entry as "id(SCHEDULED)"
        parts.append(eid + "(" + info["status"] + ")")  # status is always SCHEDULED here
    return ", ".join(parts)
def __init__(self):
    self.events = {}
# Same as L1; no new field
# items.sort(key=lambda x: (x[1]["execute_at"], x[0]))
#
# What does lambda x: (x[1]["execute_at"], x[0]) do?
# x is a tuple such as ("evt_1", {"execute_at":1000, ...})
# x[1]["execute_at"] = 1000 (fire time)
# x[0] = "evt_1" (id string, used as the tie-break)
#
# So it returns a new tuple (execute_at, id)
# sort puts smaller keys first by default →
# smallest execute_at = earliest to fire comes first
# equal execute_at → tie-break by id in ascending alphabetical order
# Example:
# ("evt_2", {execute_at: 1000}) → (1000, "evt_2")
# ("evt_1", {execute_at: 1000}) → (1000, "evt_1")
# ("evt_3", {execute_at: 2000}) → (2000, "evt_3")
# After sorting:
items = [
    ("evt_1", ...),  # 1000, "evt_1" ← tie-break puts evt_1 first
    ("evt_2", ...),  # 1000, "evt_2"
    ("evt_3", ...),  # 2000 ← later time goes last
]
# Final output:
→ "evt_1(SCHEDULED), evt_2(EXECUTED), evt_3(SCHEDULED)"
list_events(t, sort_by)
  Lists all events (any status)
  sort_by="id" → id asc
  sort_by="time" → execute_at asc, tie by id
list_pending(t)
  Lists only status == "SCHEDULED"
  Always sorted by execute_at asc, tie by id
  EXECUTED / DISPATCHED events do not appear
lazy = check only when accessed · one-shot = fires once · SCHEDULED→EXECUTED = status changes automatically when the time comes
# _process_pending is already written out in the Helpers section above
# Recap of what L3 adds:
#
# First line of every public method: self._process_pending(timestamp)
# _process_pending walks self.events once:
#   status == "SCHEDULED" and timestamp >= execute_at → due
#   interval is None → one-shot → status = "EXECUTED"
#   interval is not None → recurring → execute_at += interval, reset to SCHEDULED
# Example: a one-shot event firing
es = EventScheduler()  # create a fresh scheduler
es.schedule_event(0, "evt_1", 1000)  # schedule evt_1 to fire at t=1000
es.get_event(500, "evt_1")  # t=500 → not due → status is still "SCHEDULED"
es.get_event(1000, "evt_1")  # t=1000 → due → _process_pending sets "EXECUTED"
es.get_event(2000, "evt_1")  # t=2000 → still "EXECUTED" (a one-shot never changes again)
def __init__(self):
    self.events = {}
# Still no new instance variable; all the lazy logic lives in _process_pending
self.events = {
    "evt_1": {"execute_at": 1000, "status": "SCHEDULED", "interval": None},
}
# After _process_pending(t=1000):
self.events = {
    "evt_1": {"execute_at": 1000, "status": "EXECUTED", "interval": None},
}
# status changed, execute_at unchanged (the mark of a one-shot)
_process_pending(timestamp)
  L3 is where it actually does work
  Any event with status == "SCHEDULED" whose time has come gets its status changed
  one-shot → "EXECUTED"
  recurring → execute_at += interval, reset to "SCHEDULED"
recurring = fires repeatedly · interval_ms = gap between runs (milliseconds) · after firing, execute_at is advanced by interval and status is reset to SCHEDULED
def schedule_recurring(self, timestamp, event_id, execute_at, interval_ms):  # schedule a recurring event
    self._process_pending(timestamp)
    if event_id in self.events:  # same id already exists → reject
        return False  # never overwrite silently
    self.events[event_id] = {
        "execute_at": execute_at,  # first fire time
        "status": "SCHEDULED",  # the initial state is always SCHEDULED
        "interval": interval_ms,  # not None → _process_pending will reschedule it
    }
    return True
def get_next_execution(self, timestamp, event_id):  # when does it fire next?
    self._process_pending(timestamp)  # execute_at may have just been advanced
    if event_id not in self.events:
        return None
    return self.events[event_id]["execute_at"]  # for a recurring event the interval is already added
def __init__(self):
    self.events = {}
# Still no new instance variable
self.events = {
    "rec_1": {"execute_at": 3000, "status": "SCHEDULED", "interval": 1000},
}
# After schedule_recurring it lives in the same dict as ordinary events
# The only difference is interval != None
# After _process_pending(t=3500):
self.events = {
    "rec_1": {"execute_at": 4000, "status": "SCHEDULED", "interval": 1000},
}
# execute_at goes 3000 → 4000 (interval added)
# status is reset to SCHEDULED (always waiting for the next run)
one-shot (interval = None):
  fires → status = "EXECUTED", execute_at unchanged
  stays at "EXECUTED" for good
recurring (interval = int):
  fires → execute_at += interval
  status reset to "SCHEDULED"
  so it never ends up as "EXECUTED" (it only goes away on cancel)
  every due _process_pending pass schedules the next run
get_next_execution(t=3500, "rec_1")
  _process_pending(3500) runs first
  rec_1 execute_at=3000, 3500>=3000 → fires
  execute_at = 3000 + 1000 = 4000
  return 4000
# get_next_execution on a one-shot event:
# if the status is already "EXECUTED"
# execute_at is simply the time it was scheduled to fire (it never changes again)
batch = several ops in one call · lock = mutual exclusion · per-event_id lock = one lock per event_id · gather = run concurrently
async def batch_operations(self, timestamp, ops):  # run a batch of schedule/cancel/recurring ops
    async def do_one(op):  # handle a single op (async helper)
        op_type = op["type"]
        eid = op["event_id"]
        async with self.locks[eid]:  # lock this event_id (per-id lock)
            if op_type == "schedule":
                return self.schedule_event(timestamp, eid, op["execute_at"])  # reuse the L1 schedule_event
            elif op_type == "cancel":
                return self.cancel_event(timestamp, eid)  # reuse the L1 cancel_event
            elif op_type == "recurring":
                return self.schedule_recurring(timestamp, eid, op["execute_at"], op["interval_ms"])  # reuse the L4 schedule_recurring
            else:  # unknown type
                return False
    tasks = [do_one(op) for op in ops]  # one coroutine per op
    results = await asyncio.gather(*tasks)  # run concurrently, wait for all, keep input order
    return list(results)  # same length as the input
def __init__(self):
    self.events = {}
    self.locks = defaultdict(asyncio.Lock)
self.events = {
    "evt_1": {"execute_at": 1000, "status": "SCHEDULED", "interval": None},
}
self.locks = {
    "evt_1": <asyncio.Lock>,  # created by defaultdict on first access
    "rec_1": <asyncio.Lock>,
}
# One independent lock per event_id
# Two ops locking different ids → can run concurrently
# Two ops locking the same id → the later one waits
_process_pending(timestamp)
  Called indirectly (schedule_event / cancel_event / schedule_recurring each call it on their first line)
  No extra helper (the locking logic is written inline in batch_operations)
FS copy touches two paths → it needs sorted locks to avoid deadlock
Each Scheduler op touches a single event_id
  schedule = 1 id, cancel = 1 id
  Two ids are never locked at once → no sorted lock needed
  A plain async with self.locks[eid] is enough
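The batch relies on asyncio.gather returning results in input order no matter which coroutine finishes first. A minimal sketch, with hypothetical names `work` and `finished`:

```python
import asyncio

finished = []  # completion order, for comparison with the result order

async def work(label, delay):
    await asyncio.sleep(delay)
    finished.append(label)
    return label

async def main():
    # "slow" is listed first but completes last
    return await asyncio.gather(work("slow", 0.03), work("fast", 0.01))

results = asyncio.run(main())
print(results, finished)
```

The result list follows the order of the arguments, which is why batch_operations can return list(results) and still line up with ops one to one.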
dispatch = send to the outside · external_call = external callback (async function) · semaphore = limits how many tasks run at once · fail-fast = skip anything that is not EXECUTED
async def dispatch_events(self, timestamp, event_ids, external_call, max_concurrent):  # concurrent dispatch (semaphore)
    self._process_pending(timestamp)
    sem = asyncio.Semaphore(max_concurrent)  # at most max_concurrent in flight
    async def do_one(eid):  # dispatch a single event (async helper)
        # fail-fast: checked before acquiring the semaphore (does not hold up other tasks)
        if eid not in self.events:  # event does not exist
            return "skipped:" + eid  # skip immediately, no semaphore acquired
        if self.events[eid]["status"] != "EXECUTED":  # SCHEDULED or DISPATCHED
            return "skipped:" + eid
        async with sem:  # only after passing fail-fast (rate limit)
            try:  # external_call may raise
                await external_call(eid)  # the real external call (async function)
                self.events[eid]["status"] = "DISPATCHED"  # success → mark DISPATCHED
                return "dispatched:" + eid
            except Exception as e:  # external_call failed
                return "error:" + eid + ":" + str(e)  # error plus the reason
    tasks = [do_one(eid) for eid in event_ids]  # one coroutine per id
    results = await asyncio.gather(*tasks)  # run concurrently, wait for all, keep input order
    return list(results)  # list[str], same length as event_ids
def __init__(self):
    self.events = {}
    self.locks = defaultdict(asyncio.Lock)
# Same fields as L5; the semaphore is created inside the method (per call)
# dispatch_events returns a list[str] in three formats:
"dispatched:evt_1"       # dispatched successfully
"skipped:evt_2"          # fail-fast (does not exist / not EXECUTED)
"error:evt_3:timeout"    # external_call raised an exception
# Example:
dispatch_events(t, ["evt_1","evt_2","evt_3"], my_callback, 2)
→ ["dispatched:evt_1", "skipped:evt_2", "error:evt_3:timeout"]
If the event does not exist or its status is wrong
  return "skipped:" directly without entering the semaphore
  because acquiring the semaphore means queueing
  and a task that does not qualify must not hold up the ones that do
Order:
  1. Check that eid exists (missing → skip)
  2. Check status == "EXECUTED" (otherwise → skip)
  3. Only after passing fail-fast → acquire the semaphore
  4. try external_call → dispatched / error
Dispatch success → status becomes "DISPATCHED"
DISPATCHED is the final state (it never changes again)
one-shot event: SCHEDULED → EXECUTED → DISPATCHED
recurring event: fires on every due _process_pending pass,
  but it is always reset to SCHEDULED, so it is never observed as EXECUTED
  and dispatch therefore never acts on a recurring event
  (in _process_pending the EXECUTED mark and the reset to SCHEDULED are assigned back to back with no await in between, so no other coroutine can see the intermediate state)
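The three result formats can be reproduced with a standalone sketch. Everything here is hypothetical scaffolding: `events` is a bare dict, `flaky_call` is a made-up callback that fails for one id, and `dispatch` mirrors the check order described above. It moves the status update out of the try block, which behaves the same for this example.

```python
import asyncio

events = {
    "evt_1": {"status": "EXECUTED"},
    "evt_2": {"status": "SCHEDULED"},
    "evt_3": {"status": "EXECUTED"},
}

async def flaky_call(eid):  # hypothetical external callback
    await asyncio.sleep(0.01)
    if eid == "evt_3":
        raise RuntimeError("timeout")

async def dispatch(event_ids, external_call, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)

    async def do_one(eid):
        # fail-fast: missing or not EXECUTED never takes a semaphore slot
        if eid not in events or events[eid]["status"] != "EXECUTED":
            return "skipped:" + eid
        async with sem:
            try:
                await external_call(eid)
            except Exception as e:
                return "error:" + eid + ":" + str(e)
            events[eid]["status"] = "DISPATCHED"
            return "dispatched:" + eid

    return list(await asyncio.gather(*(do_one(e) for e in event_ids)))

results = asyncio.run(dispatch(["evt_1", "evt_2", "evt_3", "zzz"], flaky_call, 2))
print(results)
```

A failed external call leaves the event at EXECUTED, so a later dispatch attempt can retry it, while a successful one moves it to the terminal DISPATCHED state.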
── Helper ──
🟰 _is_alive: identical to InMemDB
── L1 CRUD ──
⚠️ add_record: InMemDB set has no return; DNS returns True + overwrite
🟰 resolve: same as InMemDB get (return "" if missing)
🟰 delete_record: same as InMemDB delete (delete the domain once it is empty)
── L2 Sort ──
🟰 scan_records: same as InMemDB scan
⚠️ list_domains: no InMemDB counterpart (unique to DNS)
── L3 TTL ──
🟰 add_record_with_ttl: same as InMemDB set_at_with_ttl
🟰 L3 versions of resolve/delete/scan: each adds one _is_alive check
── L4 Backup ──
🟰 backup: identical to InMemDB (deepcopy + remaining_ttl)
🟰 restore: identical to InMemDB (recalculate expiry)
── L5 Batch ──
🟰 batch_operations: same as InMemDB L5 (lock per domain)
── L6 Sync ──
🟰 propagate_records: fail-fast + sleep (same as Hashring L6)
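Imagine writing a simplified DNS resolver mock. Each domain can have several record types (A / AAAA / MX), and each type maps to one IP. The class has to model create/update/delete/lookup, sorting, TTL expiry, backup/restore, async batch, and rate-limited propagation.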
┌───────────────────────────────────────────────────────┐
│ example.com A → 1.2.3.4 expiry=None │
│ example.com AAAA → ::1 expiry=None │
│ cdn.io A → 10.0.0.1 expiry=5000 │
│ mail.org MX → 192.168.1.1 expiry=3000 │
└───────────────────────────────────────────────────────┘
Each record has 4 parts:
  domain = first-level key ("example.com")
  record_type = second-level key ("A", "AAAA", "MX")
  ip = the record's value ("1.2.3.4")
  expiry = when it expires (None = never expires)
Rules:
  1. Same domain + type → overwrite directly (overwrite variant)
  2. An expired TTL record → treated as nonexistent by an inline check (not actually deleted)
  3. If a delete leaves the domain with no types → delete the whole domain too
DNS is a phone book. domain = a person's name, record_type = the kind of phone (A = mobile, AAAA = landline), IP = the number.
The nested dict structure is identical to InMemDB:
  InMemDB: data[key][field] = {value, expiry}
  DNS: records[domain][type] = {ip, expiry}
resolve(t, "example.com", "A") → "1.2.3.4"
resolve(t, "example.com", "MX") → "" (does not exist)
scan_records(t, "example.com") → "A(1.2.3.4), AAAA(::1)"
list_domains(t) → ["cdn.io", "example.com", "mail.org"]
Later levels add more:
  L2 adds scan/list (scan_records, list_domains)
  L3 adds TTL (add_record_with_ttl, _is_alive inline check)
  L4 adds backup/restore (remaining_ttl pattern)
  L5 adds async batch_operations (per-domain lock)
  L6 adds propagate_records (rate-limited, semaphore)
import asyncio  # needed for L5/L6 async
import copy  # needed for the L4 backup deepcopy
from collections import defaultdict  # needed for the L5 per-domain lock
class DNSRouter:
    def __init__(self):
        self.records = {}  # L1: all records (domain → type → info dict)
        self.backups = []  # added in L4: [(timestamp, snapshot)]
        self.domain_locks = defaultdict(asyncio.Lock)  # added in L5: per-domain async lock
self.records = {
    "example.com": {
        "A": {"ip": "1.2.3.4", "expiry": None},
        "AAAA": {"ip": "::1", "expiry": None},
    },
    "cdn.io": {
        "A": {"ip": "10.0.0.1", "expiry": 5000},
    },
}
First-level key = domain ("example.com")
Second-level key = record_type ("A")
Third level is a dict holding that record's info
L1: ip, expiry=None are the basics
L2: (no new field, only reads records)
L3: expiry = timestamp + ttl_ms; None = never expires, int = when it expires
L4: (no new field, backup uses remaining_ttl)
L5: self.domain_locks, a defaultdict(asyncio.Lock) added in __init__
L6: (no new field, the semaphore is created inside the method)
def _is_alive(self, record_data, timestamp): # check 呢個 field 有冇過期
if record_data["expiry"] is None: # expiry = None → 冇 TTL → 永遠活
return True # 直接 return True
if timestamp < record_data["expiry"]: # 嚴格小於 → 仲未到期
return True # 仲活
return False # timestamp >= expiry → 死咗
Identical to InMemDB's _is_alive.
Key point: timestamp < expiry (strictly less than), so timestamp == expiry already counts as dead.
_is_alive(record_data, timestamp)
expiry = None → True (lives forever)
ts < expiry → True (not yet expired)
ts >= expiry → False (dead)
Example:
expiry = 100
_is_alive(rd, 99) → 99 < 100 → True
_is_alive(rd, 100) → 100 < 100 → False (exactly at expiry = dead)
_is_alive(rd, 101) → 101 < 100 → False
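The boundary rule above can be checked with a tiny standalone sketch. The free function `is_alive` is a hypothetical stand-in for the `_is_alive` method; it takes the expiry value directly instead of a record dict.

```python
from typing import Optional


def is_alive(expiry: Optional[int], timestamp: int) -> bool:
    """Return True while a record with this expiry is still valid at `timestamp`."""
    if expiry is None:           # no TTL: the record never expires
        return True
    return timestamp < expiry    # strict comparison: the expiry tick itself is already dead


# Walk across the boundary at expiry = 100.
for ts in (99, 100, 101):
    print(ts, is_alive(100, ts))
```

If a spec says "less than or equal", only the comparison operator changes, which is why it pays to isolate the check in one helper.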
add = add or overwrite   resolve = look up   delete = remove   overwrite = if the record already exists, write over it (no return False)
def add_record(self, timestamp, domain, record_type, ip): # 加一條 DNS record(overwrite)
if domain not in self.records: # 呢個 domain 第一次見
self.records[domain] = {} # 開個空 inner dict(一行新嘅 domain)
self.records[domain][record_type] = { # 直接寫入(已存在就覆蓋)
"ip": ip, # 記低 IP
"expiry": None, # L1 冇 TTL → None(L3 先會 set 數字)
} # 冇 return(overwrite variant 唔理重複)
def resolve(self, timestamp, domain, record_type): # 查某 domain 某 type 嘅 IP
if domain not in self.records: # domain 唔存在
return "" # 約定 return 空 string(唔係 None!)
if record_type not in self.records[domain]: # domain 有但呢個 type 冇
return "" # 都係 return 空 string
return self.records[domain][record_type]["ip"] # 攞出 IP(唔係成個 dict)
# 🟰 同 InMemDB delete 一樣(清空就刪 domain)
def delete_record(self, timestamp, domain, record_type): # 刪某條 record
if domain not in self.records: # domain 唔存在
return False # 冇得刪
if record_type not in self.records[domain]: # domain 有但 type 冇
return False # 都冇得刪
del self.records[domain][record_type] # 刪走呢個 type
if not self.records[domain]: # 成個 domain 冇任何 type 了?
del self.records[domain] # 連 domain 都拆走(同 InMemDB delete 一模一樣)
return True # 刪成功
def __init__(self):
self.records = {} # domain → type → {ip, expiry}
self.records = {                                    # DNS records (domain → type → {ip, expiry})
    "example.com": {
        "A": {"ip": "1.2.3.4", "expiry": None},     # never expires
        "AAAA": {"ip": "::1", "expiry": None},
    },
}
self.backups = []                                   # backup list (added in L4)
self.domain_locks = defaultdict(asyncio.Lock)       # per-domain locks (added in L5)
vs InMemDB:
InMemDB: data[key][field] = {"value": v, "expiry": None}
DNS: records[domain][type] = {"ip": v, "expiry": None}
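Same structure; only the field names differ.
Overwrite variant: add_record does not check for duplicates and never returns False. It writes straight over the old value, just like InMemDB's set (only Bank-style problems return False on a duplicate).
scan_records = list every type of one domain   list_domains = list all domains, sorted alphabetically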
def scan_records(self, timestamp, domain): # 列某 domain 所有 record type
if domain not in self.records: # domain 唔存在
return "" # 空 string
types = sorted(self.records[domain]) # 攞所有 type name 再排序(loop dict = loop keys)
if not types: # 空 list → 冇嘢
return "" # 空 string
parts = [] # 砌 output 字串
for t in types: # 逐個 type 行一次
ip = self.records[domain][t]["ip"] # 攞呢個 type 嘅 IP
parts.append(t + "(" + ip + ")") # 砌做 "A(1.2.3.4)" 格式
return ", ".join(parts) # 用 ", " 連埋一齊
# ⚠️ InMemDB 冇對應(DNS 獨有,但 pattern 簡單)
def list_domains(self, timestamp): # 列所有 domain
result = sorted(self.records) # sorted(dict) = sorted(dict.keys())
return result # 返 list of domain strings
def __init__(self):
self.records = {}
Same as L1; no new fields.
self.records["example.com"] is:
{"A": {"ip":"1.2.3.4",...}, "AAAA": {"ip":"::1",...}}
types = sorted(self.records["example.com"])
sorted({"A": ..., "AAAA": ...})
iterating a dict iterates its keys → ["A", "AAAA"]
sorted → ["A", "AAAA"] ("A" sorts before "AAAA")
for t in types → one pass per type
t = "A" → ip = "1.2.3.4" → "A(1.2.3.4)"
t = "AAAA" → ip = "::1" → "AAAA(::1)"
parts = ["A(1.2.3.4)", "AAAA(::1)"]
", ".join(parts)
→ "A(1.2.3.4), AAAA(::1)"
Exactly the same pattern as InMemDB scan.
InMemDB: "field(value), field(value)"
DNS: "type(ip), type(ip)"
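The sort-then-format step can be sketched on its own. `format_scan` is a hypothetical helper name; it takes the inner dict of one domain (type → info dict) and ignores TTL, as in L2.

```python
def format_scan(inner: dict) -> str:
    """Format one domain's records as "TYPE(ip), TYPE(ip)", sorted by type name."""
    parts = []
    for rtype in sorted(inner):                  # sorted(dict) sorts the keys
        parts.append(f"{rtype}({inner[rtype]['ip']})")
    return ", ".join(parts)                      # an empty dict gives ""


records = {
    "AAAA": {"ip": "::1", "expiry": None},       # inserted first on purpose
    "A": {"ip": "1.2.3.4", "expiry": None},
}
print(format_scan(records))
```

Because `", ".join([])` is already the empty string, the explicit "no types" early return in the full method is a readability choice rather than a necessity.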
TTL = time to live   ttl_ms = how long until expiry (milliseconds)   expiry = absolute expiry timestamp   inline = checked at the moment of use, never deleted in advance
def add_record_with_ttl(self, timestamp, domain, record_type, ip, ttl_ms): # 加 record + 設 TTL
if domain not in self.records: # domain 第一次見
self.records[domain] = {} # 開個空 inner dict
expiry = timestamp + ttl_ms # 絕對過期時間 = 而家 + 壽命
self.records[domain][record_type] = { # 直接寫入(已存在就覆蓋)
"ip": ip, # 記低 IP
"expiry": expiry, # 記低幾時死
} # 冇 return(同 add_record 一樣)
# 🟰 L1 resolve 加一行 _is_alive check
def resolve(self, timestamp, domain, record_type): # 查某 domain 某 type 嘅 IP
if domain not in self.records: # domain 唔存在
return "" # 空 string
if record_type not in self.records[domain]: # type 唔存在
return "" # 空 string
rd = self.records[domain][record_type] # 攞呢條 record 嘅 dict
if not self._is_alive(rd, timestamp): # 過期 → 當唔存在
return "" # 空 string
return rd["ip"] # 仲活 → return IP
# 🟰 L1 delete_record 加一行 _is_alive check
def delete_record(self, timestamp, domain, record_type): # 刪某條 record
if domain not in self.records: # domain 唔存在
return False # 失敗就返 False;caller 可以當今次要求冇落地
if record_type not in self.records[domain]: # type 唔存在
return False # 失敗就返 False;caller 可以當今次要求冇落地
rd = self.records[domain][record_type] # 攞呢條 record
if not self._is_alive(rd, timestamp): # 已死 = 當唔存在
return False # 死嘅 record 你 delete 唔到
del self.records[domain][record_type] # 刪走呢個 type
if not self.records[domain]: # 成個 domain 冇任何 type?
del self.records[domain] # 連 domain 都拆走
return True # 刪成功
# 🟰 L2 scan_records 加 _is_alive filter
def scan_records(self, timestamp, domain): # 列某 domain 所有 record type
if domain not in self.records: # domain 唔存在
return "" # 唔存在 → 空 string
types = [] # 攞活嘅 type name
for t in self.records[domain]: # 逐個 type 行一次
rd = self.records[domain][t] # 攞呢條 record
if self._is_alive(rd, timestamp): # 仲活?
types.append(t) # 留低
types.sort() # 按字母排序
if not types: # 冇活嘅 type
return "" # 唔存在 → 空 string
parts = [] # 砌 output
for t in types: # 逐個砌
ip = self.records[domain][t]["ip"] # 攞 IP
parts.append(t + "(" + ip + ")") # "A(1.2.3.4)"
return ", ".join(parts) # "A(1.2.3.4), AAAA(::1)"
# 🟰 L2 list_domains 加 _is_alive filter
def list_domains(self, timestamp): # 列所有 domain
result = [] # 暫存有活 record 嘅 domain
for domain in self.records: # 逐個 domain 行一次
has_alive = False # 呢個 domain 有冇活嘅 record?
for t in self.records[domain]: # 逐個 type check
rd = self.records[domain][t] # 攞 record dict
if self._is_alive(rd, timestamp): # 有一個活就夠
has_alive = True # 標記
break # 唔使再 check
if has_alive: # 有活 record
result.append(domain) # 入 list
result.sort() # 按字母排序
return result # 返 sorted list
def __init__(self):
self.records = {}
Still no new instance variable; the TTL information lives inside each record dict
self.records = {                                    # DNS records (domain → type → {ip, expiry})
    "example.com": {
        "A": {"ip": "1.2.3.4", "expiry": None},     # never expires
        "AAAA": {"ip": "::1", "expiry": 5000},      # expires at timestamp 5000
    },
}
self.backups = []                                   # backup list (added in L4)
self.domain_locks = defaultdict(asyncio.Lock)       # per-domain locks (added in L5)
resolve: adds an _is_alive check; an expired record returns ""
delete_record: adds an _is_alive check; a dead record returns False
scan_records: lists only live types
list_domains: lists only domains that have at least one live record
add_record_with_ttl(10, "cdn.io", "A", "10.0.0.1", 50)
expiry = 10 + 50 = 60
time:  10 --- 30 --- 59 --- 60 --- 999
alive: Y Y Y N N
↑ 60 < 60? No → dead
resolve(59, "cdn.io", "A") → "10.0.0.1"
resolve(60, "cdn.io", "A") → ""
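backup = take a photo   restore = rebuild from the photo   remaining_ttl = how much time is left (not an absolute time!)
def backup(self, timestamp): # take a snapshot (built by hand, not deepcopy: live records only, storing remaining_ttl)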
snapshot = {} # 張相(空)
count = 0 # 數幾多個 domain 有活嘅 record
for domain in self.records: # 逐個 domain 行
alive_types = {} # 呢個 domain 入面活嘅 record type
for rtype, rd in self.records[domain].items(): # 逐個 type 行
if self._is_alive(rd, timestamp): # 呢條 record 仲活?
remaining = None # 默認冇 TTL
if rd["expiry"] is not None: # 有 TTL?
remaining = rd["expiry"] - timestamp # 計仲剩幾耐
alive_types[rtype] = { # 影低呢條 record
"ip": rd["ip"], # 存 IP
"remaining_ttl": remaining, # 存仲剩幾耐(唔係 expiry!)
}
if alive_types: # 呢個 domain 有活嘅 record?
snapshot[domain] = alive_types # 放入相
count += 1 # 數多一個 domain
self.backups.append((timestamp, snapshot)) # 存張相入 backups list
return count # return 影咗幾多個 domain
# 🟰 同 InMemDB restore 完全一樣(recalculate expiry)
def restore(self, timestamp, backup_timestamp): # 還原到某個 snapshot
best = None # 記住最近嗰張相
for ts, snap in self.backups: # 逐張相睇
if ts <= backup_timestamp: # 呢張喺目標時間或之前?
if best is None or ts > best[0]: # 係最近嗰張?
best = (ts, snap) # 記住
if best is None: # 搵唔到任何相
return "" # 冇得還原
_, snapshot = best # 攞張相出嚟
self.records = {} # 清空成個 DNS table
count = 0 # 數還原咗幾多個 domain
for domain, types in snapshot.items(): # 逐個 domain 重建
self.records[domain] = {} # 開返呢個 domain
for rtype, rd in types.items(): # 逐個 type 重建
expiry = None # 默認冇 TTL
if rd["remaining_ttl"] is not None: # 有 TTL?
expiry = timestamp + rd["remaining_ttl"] # 重算幾時死
self.records[domain][rtype] = { # 放返入 DNS table
"ip": rd["ip"], # 抄 IP
"expiry": expiry, # 用新計嘅 expiry
}
if self.records[domain]: # 呢個 domain 有 record?
count += 1 # 數多一個
return str(count) # return string!"1" 唔係 1
def __init__(self):
self.records = {}
self.backups = [] # [(timestamp, snapshot)]
self.records = {
"example.com": {
"A": {"ip": "1.2.3.4", "expiry": None},
"AAAA": {"ip": "::1", "expiry": 60},
},
"cdn.io": {
"A": {"ip": "10.0.0.1", "expiry": 30},
},
}
backup(40):
example.com/A lives forever
example.com/AAAA has 60 - 40 = 20 left
cdn.io/A: expiry 30 < 40 → already dead, not captured
snapshot = {
"example.com": {
"A": {"ip": "1.2.3.4", "remaining_ttl": None},
"AAAA": {"ip": "::1", "remaining_ttl": 20},
},
}
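cdn.io has no live records, so it is left out of the snapshot
backup returns 1 (only one domain has a live record)
restore(200, 40):
find the backup: ts=40 <= 40 → use this one
clear the DNS table
rebuild record by record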
self.records = {
"example.com": {
"A": {"ip": "1.2.3.4", "expiry": None},
"AAAA": {"ip": "::1", "expiry": 220},
},
}
ts(200) + remaining(20) = 220
AAAA gets another 20 time units of life
return "1" (a string, not an int)
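The remaining_ttl round trip in this walkthrough can be reproduced with two small free functions. This is a sketch, not the class methods themselves: `take_snapshot` and `rebuild` are hypothetical names, and they skip the backup list and the return-value conventions.

```python
def take_snapshot(records: dict, now: int) -> dict:
    """Copy only live records, storing remaining TTL instead of absolute expiry."""
    snapshot = {}
    for domain, types in records.items():
        alive = {}
        for rtype, rd in types.items():
            expiry = rd["expiry"]
            if expiry is not None and now >= expiry:
                continue                                   # dead at `now`: not captured
            remaining = None if expiry is None else expiry - now
            alive[rtype] = {"ip": rd["ip"], "remaining_ttl": remaining}
        if alive:                                          # skip domains with no live record
            snapshot[domain] = alive
    return snapshot


def rebuild(snapshot: dict, now: int) -> dict:
    """Recreate the record table, recomputing expiry relative to the restore time."""
    records = {}
    for domain, types in snapshot.items():
        records[domain] = {}
        for rtype, rd in types.items():
            remaining = rd["remaining_ttl"]
            expiry = None if remaining is None else now + remaining
            records[domain][rtype] = {"ip": rd["ip"], "expiry": expiry}
    return records


live = {
    "example.com": {
        "A": {"ip": "1.2.3.4", "expiry": None},
        "AAAA": {"ip": "::1", "expiry": 60},
    },
    "cdn.io": {"A": {"ip": "10.0.0.1", "expiry": 30}},
}
snap = take_snapshot(live, now=40)       # cdn.io is already dead at 40
restored = rebuild(snap, now=200)        # AAAA: 200 + 20 = 220
print(restored["example.com"]["AAAA"]["expiry"])
```

Storing the remaining time rather than the absolute expiry is what lets a restore at a much later timestamp give the record its unused lifetime back.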
batch = run several ops in one go   lock = mutual exclusion   per-domain lock = one lock per domain
async def batch_operations(self, timestamp, ops):    # batch of ops (one lock per domain + gather)
    async def execute_op(op):                         # run a single op (async inner function)
        domain = op.get("domain", "")                 # which domain this op targets
        lock = self.domain_locks[domain]              # that domain's lock
        async with lock:                              # ops on the same domain queue up here
            if op["type"] == "add":
                self.add_record(                      # reuse L1 add_record
                    timestamp,                        # passed for signature consistency; add_record does no TTL work itself
                    op["domain"],                     # domain to write to
                    op["record_type"],                # record type, e.g. A or AAAA
                    op["ip"],                         # value to store
                )
                return None                           # add returns None (not True/False)
            elif op["type"] == "resolve":
                return self.resolve(                  # reuse resolve
                    timestamp,                        # decides whether a TTL record is still alive
                    op["domain"],                     # domain to look up
                    op["record_type"],                # which record type to read
                )
            elif op["type"] == "delete":
                return self.delete_record(            # reuse delete_record
                    timestamp,                        # an already expired record cannot be deleted (returns False)
                    op["domain"],                     # domain whose record is removed
                    op["record_type"],                # which record type to remove
                )
            return None                               # unknown type → None
    tasks = []                                        # coroutine objects collected here
    for op in ops:
        tasks.append(execute_op(op))                  # not awaited yet
    results = await asyncio.gather(*tasks)            # run concurrently, wait for all, order preserved
    return list(results)                              # gather already returns a list; list() just makes a copy
def __init__(self):
self.records = {}
self.backups = []
self.domain_locks = defaultdict(asyncio.Lock)
self.records = {                                    # DNS records (domain → type → {ip, expiry})
    "example.com": {
        "A": {"ip": "1.2.3.4", "expiry": None},
    },
}
self.backups = [                                    # backup list (added in L4)
    (50, {"example.com": {"A": {"ip": "1.2.3.4", "remaining_ttl": None}}}),
]
self.domain_locks = {                               # per-domain locks (added in L5)
    "example.com": <asyncio.Lock>,                  # defaultdict creates it on first access
    "cdn.io": <asyncio.Lock>,
}
vs InMemDB L5:
InMemDB locks per key → DNS locks per domain.
InMemDB set returns None → DNS add returns None.
Same logic, different names.
Note that add returns None (not True/False), resolve returns the IP or "", and delete returns True/False.
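A stripped-down sketch of the per-key lock plus gather pattern, using a plain dict as the store. The names `run_batch` and `store` are assumptions for illustration, and only "add" and "resolve" are modelled.

```python
import asyncio
from collections import defaultdict


async def run_batch(ops):
    store = {}                              # hypothetical stand-in for self.records
    locks = defaultdict(asyncio.Lock)       # one lock per domain, created on first access

    async def execute(op):
        async with locks[op["domain"]]:     # ops on the same domain are serialized
            if op["type"] == "add":
                store[op["domain"]] = op["ip"]
                return None                 # add reports nothing
            if op["type"] == "resolve":
                return store.get(op["domain"], "")
            return None                     # unknown op type

    # gather keeps results in the same order as the ops that produced them
    return await asyncio.gather(*(execute(op) for op in ops))


ops = [
    {"type": "add", "domain": "example.com", "ip": "1.2.3.4"},
    {"type": "resolve", "domain": "example.com"},
    {"type": "resolve", "domain": "cdn.io"},
]
results = asyncio.run(run_batch(ops))
print(type(results).__name__, results)
```

The awaited result of `asyncio.gather` is a list, so wrapping it in `list()` only makes a copy; it does no harm when a spec insists on a fresh list.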
propagate = push DNS records to other servers   semaphore = rate limiter (caps how many run at once)   fail-fast = fail immediately if the domain does not exist (without holding up other tasks)
async def propagate_records(self, timestamp, domains, max_concurrent): # 並發 propagate(fail-fast)
sem = asyncio.Semaphore(max_concurrent) # 開一個 N 位嘅 semaphore(同時最多 N 個)
async def propagate_one(domain): # 做單一 domain 嘅 propagation
# fail-fast:未攞 semaphore 之前已經 check
if domain not in self.records: # domain 唔存在
return False # 即刻 False,唔 acquire semaphore
has_alive = False # check 有冇活嘅 record
for t in self.records[domain]: # 逐個 type check
rd = self.records[domain][t] # 攞 record dict
if self._is_alive(rd, timestamp): # 有一個活就得
has_alive = True # 標記
break # 唔使再 check
if not has_alive: # 全部死晒
return False # 即刻 False,唔攞 semaphore
async with sem: # 過咗 fail-fast 先攞 semaphore(限速)
await asyncio.sleep(0.01) # 模擬 propagation 嘅延遲(10ms)
return True # propagation 成功
tasks = [] # 暫存所有 coroutine
for domain in domains: # 逐個 domain 包做 task
tasks.append(propagate_one(domain)) # 入 list(未 await)
results = await asyncio.gather(*tasks) # 並發跑,等全部完,保留順序
final = [] # 轉做正常 list
for r in results: # 逐個 copy 過
final.append(r) # 入 list
return final # 返一個同 domains 一樣長嘅 list[bool]
def __init__(self):
self.records = {}
self.backups = []
self.domain_locks = defaultdict(asyncio.Lock)
Same as L5; the semaphore is created inside the method (per call)
self.records = {                                    # DNS records (domain → type → {ip, expiry})
    "example.com": {
        "A": {"ip": "1.2.3.4", "expiry": None},
    },
}
self.backups = [...]                                # backup list (added in L4)
self.domain_locks = defaultdict(asyncio.Lock)       # per-domain locks (added in L5)
The semaphore is not stored on self; each propagate call creates a fresh one
How fail-fast works:
1. domain does not exist → return False immediately (no semaphore)
2. every record of the domain is dead → return False immediately (no semaphore)
3. there is a live record → take the semaphore → sleep → True
Returns list[bool] (not a dict; each position matches the same position in domains).
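The three fail-fast steps can be sketched in isolation. Here `propagate` and the `live` set are hypothetical stand-ins for the method and for "domain has at least one live record", and the `peak` counter is added only to show that the semaphore really caps concurrency.

```python
import asyncio


async def propagate(domains, live, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)     # created per call, not stored on self
    in_flight = 0
    peak = 0

    async def one(domain):
        nonlocal in_flight, peak
        if domain not in live:                  # fail-fast: checked before touching the semaphore
            return False
        async with sem:                         # at most max_concurrent tasks inside
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)           # simulated propagation delay
            in_flight -= 1
            return True

    results = await asyncio.gather(*(one(d) for d in domains))
    return results, peak


results, peak = asyncio.run(
    propagate(["a.com", "ghost.io", "b.com", "c.com"], {"a.com", "b.com", "c.com"}, 2)
)
print(results, peak)
```

The result list lines up index by index with the input list, and the invalid domain never occupies one of the two slots.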
_is_alive(record_data, timestamp)
used inside propagate_one to check whether a record is alive
propagate_one(domain)
this level's own async helper, wrapping fail-fast + semaphore + sleep
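── Helper ──
🟰 _purge_expired: same expiry boundary as InMemDB/DNS (dead once timestamp >= expiry), but expired entries are physically deleted (lazy purge) instead of being filtered inline
── L1 CRUD ──
⚠️ grant: three-level nested dict (InMemDB has two) + returns True
🟰 check: similar to InMemDB get
⚠️ revoke: cleans up three levels (delete the user/resource once it is empty)
── L2 Filter ──
🟰 list_permissions: same as DNS scan_records (sorted + format)
⚠️ list_users_with_permission: unique to this mock (reverse lookup, scans every user)
── L3 TTL ──
🟰 grant_with_ttl: same as DNS add_record_with_ttl
── L4 Backup ──
🟰 backup: same goal as InMemDB/DNS, but deepcopies the state and keeps expires_at
🟰 restore: same TTL recalculation as InMemDB/DNS, with the snapshot chosen by backup_id
── L5 Batch ──
🟰 batch_operations: same as InMemDB L5 (lock per user_id)
── L6 Sync ──
🟰 sync_permissions: fail-fast + sleep (same as Hashring L6)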
Imagine writing a mock of a permission management system. Each user can hold several permissions on several resources. You write one class that simulates granting, revoking, querying, expiry, backup, and async batches.
Picture a permission table:
┌───────────┬───────────┬────────────┬──────────┐
│ user_id   │ resource  │ permission │ TTL      │
├───────────┼───────────┼────────────┼──────────┤
│ alice     │ doc/1     │ read       │ never    │
│ alice     │ doc/1     │ write      │ never    │
│ alice     │ doc/2     │ read       │ 5000ms   │
│ bob       │ doc/1     │ read       │ never    │
└───────────┴───────────┴────────────┴──────────┘
Each entry has:
user_id = who
resource = which resource (e.g. "doc/1")
permission = which right (e.g. "read", "write")
granted_at = when it was granted
expires_at = when it expires (None = never expires)
Rules:
1. grant always succeeds; a duplicate is overwritten (return True)
2. A TTL permission that has expired is treated as nonexistent (lazy purge)
3. After revoke, clean up empty dicts level by level
check(t, "alice", "doc/1", "read") → True
check(t, "alice", "doc/1", "delete") → False (no such permission)
check(t, "bob", "doc/2", "read") → False (bob has nothing on doc/2)
list_permissions(t, "alice")
→ "doc/1:read, doc/1:write, doc/2:read"
(sorted by resource first, then by permission within the same resource)
list_users_with_permission(t, "doc/1", "read")
→ ["alice", "bob"] (sorted)
L2 adds sort/filter (list_permissions, list_users_with_permission)
L3 adds TTL (grant_with_ttl, lazy _purge_expired)
L4 adds backup/restore (deepcopy + remaining_ttl)
L5 adds async batch_operations (per-user_id lock)
L6 adds sync_permissions (rate-limited, semaphore, fail-fast)
import asyncio, copy
from collections import defaultdict
class PermissionACL:
def __init__(self):
self.permissions = {} # L1 三層 nested dict
self.backups = [] # L4 加:備份 list
self.locks = defaultdict(asyncio.Lock) # L5 加:per-user_id lock
self.permissions = {
"alice": {
"doc/1": {
"read": {"granted_at": 10, "expires_at": None},
"write": {"granted_at": 15, "expires_at": None},
},
"doc/2": {
"read": {"granted_at": 20, "expires_at": 5020},
},
},
"bob": {
"doc/1": {
"read": {"granted_at": 30, "expires_at": None},
},
},
}
First-level key = user_id ("alice")
Second-level key = resource ("doc/1")
Third-level key = permission ("read")
Value = {"granted_at": ts, "expires_at": None|int}
L1: granted_at, expires_at (the basics; expires_at is always None)
L2: (no new fields; it only reads permissions)
L3: expires_at now carries a value; None = never expires, int = when it expires
L4: self.backups, a list added in __init__
L5: self.locks, a defaultdict(asyncio.Lock) added in __init__
L6: (no new fields; the semaphore is created inside the method)
# Helper: _purge_expired, lazy TTL cleanup of expired permissions (called at the top of every public method)
# 🟰 Same collect-then-delete purge as FS; the expiry boundary (timestamp >= expiry is dead) matches InMemDB/DNS
def _purge_expired(self, timestamp): # 清走過期 permission 記錄(lazy);等有人查 ACL 前先順手掃枯葉
empty_users = [] # 暫存要刪嘅 user_id
for user_id, resources in self.permissions.items(): # 逐個 user 行
empty_resources = [] # 暫存要刪嘅 resource
for resource, perms in resources.items(): # 逐個 resource 行
expired_perms = [] # 暫存要刪嘅 permission
for perm, info in perms.items(): # 逐個 permission 行
exp = info["expires_at"] # 攞 expires_at(可能 None)
if exp is not None and timestamp >= exp: # 有 TTL 且過咗期
expired_perms.append(perm) # 入到要刪 list
for perm in expired_perms: # 逐個 del permission
del perms[perm] # 真正刪走
if not perms: # resource 下面冇晒 permission
empty_resources.append(resource) # 入到要刪 list
for resource in empty_resources: # 逐個 del 空 resource
del resources[resource] # 逐層 cleanup
if not resources: # user 下面冇晒 resource
empty_users.append(user_id) # 入到要刪 list
for user_id in empty_users: # 逐個 del 空 user
del self.permissions[user_id] # 最外層 cleanup
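_purge_expired(timestamp)
Three nested loops: user → resource → permission
Any entry whose expires_at is not None and timestamp >= expires_at
is deleted from self.permissions
After deleting, empty dicts are cleaned up level by level
Every public method calls it on its first line (lazy mode)
Suppose alice has a single permission left (doc/2:read, expires_at=100)
purge at timestamp=100:
del permissions["alice"]["doc/2"]["read"]
permissions["alice"]["doc/2"] = {} ← empty
del permissions["alice"]["doc/2"]
permissions["alice"] = {} ← empty
del permissions["alice"]
Without the cleanup:
permissions = {"alice": {"doc/2": {}}}
check("alice", "doc/2", "read") does not crash
but the empty "alice" and "doc/2" shells stay behind, so anything that enumerates users or resources, or compares the raw state, sees entries that should be gone
and an exam assertion on that state fails
Deleting from a dict while iterating over it raises RuntimeError in Python, so collect the keys to delete first and delete them after the loop. Same pattern as FS's _purge_expired, but there are three levels here, hence three "to delete" lists.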
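A one-level sketch of why the keys are collected first. `purge_unsafe` and `purge_safe` are hypothetical names, and the dict maps permission name → expires_at directly to keep it short.

```python
def purge_unsafe(perms: dict, now: int) -> None:
    """Deletes while iterating: Python raises RuntimeError on the next loop step."""
    for name, expires_at in perms.items():
        if expires_at is not None and now >= expires_at:
            del perms[name]                     # changes the dict size mid-iteration


def purge_safe(perms: dict, now: int) -> None:
    """Collect expired keys first, delete only after the loop has finished."""
    expired = [
        name for name, expires_at in perms.items()
        if expires_at is not None and now >= expires_at
    ]
    for name in expired:
        del perms[name]


try:
    purge_unsafe({"read": 100, "write": None, "temp": 50}, now=100)
except RuntimeError as err:
    print("unsafe purge failed:", err)

perms = {"read": 100, "write": None, "temp": 50}
purge_safe(perms, now=100)
print(perms)
```

The three-level version repeats the same idea once per level: one list for expired permissions, one for emptied resources, one for emptied users.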
grant = authorize (always True)   check = query (True/False)   revoke = withdraw (True/False + level-by-level cleanup)
def grant(self, timestamp, user_id, resource, permission): # 授權(覆蓋舊嘅)
self._purge_expired(timestamp) # 開頭先清過期(公定模式)
if user_id not in self.permissions: # 呢個 user 第一次見
self.permissions[user_id] = {} # 開個空 dict
if resource not in self.permissions[user_id]: # 呢個 resource 第一次見
self.permissions[user_id][resource] = {} # 開個空 dict
self.permissions[user_id][resource][permission] = { # 寫入 permission(覆蓋舊嘅)
"granted_at": timestamp, # 記低幾時授權
"expires_at": None, # 冇 TTL = None(L3 先會 set 數字)
}
return True # 一律成功
# 🟰 同 InMemDB get 類似(check 三層 dict 存唔存在)
def check(self, timestamp, user_id, resource, permission): # 查某人有冇某權限
self._purge_expired(timestamp) # 開頭先清過期(過期嘅權限應該當冇)
if user_id not in self.permissions: # user 唔存在
return False # 冇
if resource not in self.permissions[user_id]: # resource 唔存在
return False # 冇
if permission not in self.permissions[user_id][resource]: # permission 唔存在
return False # 冇
return True # purge 完仲喺度 = 有效
# ⚠️ 同 InMemDB delete 類似但要清三層(user 空就刪 user,resource 空就刪 resource)
def revoke(self, timestamp, user_id, resource, permission): # 撤銷權限
self._purge_expired(timestamp) # 開頭先清過期
if user_id not in self.permissions: # user 唔存在
return False # 冇得撤
if resource not in self.permissions[user_id]: # resource 唔存在
return False # 冇得撤
if permission not in self.permissions[user_id][resource]: # permission 唔存在
return False # 冇得撤
del self.permissions[user_id][resource][permission] # 刪走呢個 permission
if not self.permissions[user_id][resource]: # resource 下面冇晒 permission?
del self.permissions[user_id][resource] # 刪埋 resource
if not self.permissions[user_id]: # user 下面冇晒 resource?
del self.permissions[user_id] # 刪埋 user
return True # 撤銷成功
def __init__(self):
self.permissions = {} user_id → resource → permission → info
self.permissions = { 權限表(user → resource → permission → info)
"alice": {
"doc/1": {
"read": {
"granted_at": 10, 幾時授權
"expires_at": None, 永遠唔過期(L3 先會 set 數字)
},
"write": {
"granted_at": 15,
"expires_at": None,
},
},
},
}
self.backups = [] 備份 list(L4 先加)
self.locks = defaultdict(asyncio.Lock) per-user_id 鎖(L5 先加)
grant(10, "alice", "doc/1", "read") → True
grant(20, "alice", "doc/1", "read") → True (overwrites; granted_at becomes 20)
Same as InMemDB's set: a duplicate key is simply overwritten
Compare FS's add_file → a duplicate returns False (overwrite refused)
revoke(t, "alice", "doc/1", "read")
del permissions["alice"]["doc/1"]["read"]
if permissions["alice"]["doc/1"] == {} → del
if permissions["alice"] == {} → del
Why clean up:
no cleanup → permissions = {"alice": {"doc/1": {}}}
the empty "alice" and "doc/1" shells linger, so anything that enumerates users or resources, or compares the raw state, sees entries that should be gone
and an exam assertion on that state fails
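To see the leftover shells concretely, here is a sketch of revoke as a free function with a switch for the cleanup. The `cleanup` flag is purely illustrative and not part of any spec.

```python
def revoke(permissions: dict, user_id: str, resource: str, permission: str,
           cleanup: bool = True) -> bool:
    """Remove one permission; optionally prune the dicts it leaves empty."""
    perms = permissions.get(user_id, {}).get(resource, {})
    if permission not in perms:
        return False                            # nothing to revoke
    del perms[permission]
    if cleanup:
        if not permissions[user_id][resource]:  # resource has no permissions left
            del permissions[user_id][resource]
        if not permissions[user_id]:            # user has no resources left
            del permissions[user_id]
    return True


tidy = {"alice": {"doc/1": {"read": {"granted_at": 10, "expires_at": None}}}}
messy = {"alice": {"doc/1": {"read": {"granted_at": 10, "expires_at": None}}}}

revoke(tidy, "alice", "doc/1", "read")
revoke(messy, "alice", "doc/1", "read", cleanup=False)
print(tidy)     # pruned all the way up
print(messy)    # empty shells remain
```

Any later check such as `"alice" in permissions` gives different answers for the two dicts, which is the kind of mismatch a state-based assertion picks up.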
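_purge_expired(timestamp)
Called on the first line of every L1 method
L1 itself never produces an expired permission (grant always sets expires_at=None)
but build the habit now so that TTL works the moment L3 adds it
list_permissions = every permission of one user   list_users_with_permission = which users hold a given permission on a given resource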
def list_permissions(self, timestamp, user_id): # 列某 user 嘅所有權限
self._purge_expired(timestamp) # 開頭先清過期
if user_id not in self.permissions: # user 唔存在
return "" # 冇權限 → 空 string
parts = [] # 暫存所有 "resource:permission" string
resources = self.permissions[user_id] # 攞呢個 user 嘅全部 resource
for resource in sorted(resources.keys()): # 逐個 resource(sorted asc)
perms = resources[resource] # 攞呢個 resource 嘅全部 permission
for perm in sorted(perms.keys()): # 逐個 permission(sorted asc)
parts.append(resource + ":" + perm) # 砌做 "resource:permission"
return ", ".join(parts) # 用 ", " 連埋一齊
# ⚠️ InMemDB/DNS 冇對應 — Permission 獨有(要 scan 全部 user)
def list_users_with_permission(self, timestamp, resource, permission): # 反向查:邊啲 user 有呢個權限
self._purge_expired(timestamp) # 開頭先清過期
result = [] # 暫存有呢個權限嘅 user_id
for user_id, resources in self.permissions.items(): # 逐個 user 行
if resource not in resources: # 呢個 user 冇呢個 resource
continue # 跳過
if permission not in resources[resource]: # 有 resource 但冇呢個 permission
continue # 跳過
result.append(user_id) # 有 → 記低 user_id
result.sort() # user_id sorted asc
return result # return sorted list
def __init__(self):
self.permissions = {}
同 L1 一樣,冇加新 field
self.permissions = {
"alice": {
"doc/1": {
"read": {"granted_at": 10, "expires_at": None},
"write": {"granted_at": 15, "expires_at": None},
},
"doc/2": {
"read": {"granted_at": 20, "expires_at": None},
},
},
"bob": {
"doc/1": {
"read": {"granted_at": 30, "expires_at": None},
},
},
}
resources = permissions["alice"]
sorted(resources.keys()) → ["doc/1", "doc/2"]
resource = "doc/1":
sorted(perms.keys()) → ["read", "write"]
→ "doc/1:read"
→ "doc/1:write"
resource = "doc/2":
sorted(perms.keys()) → ["read"]
→ "doc/2:read"
parts = ["doc/1:read", "doc/1:write", "doc/2:read"]
", ".join(parts)
→ "doc/1:read, doc/1:write, doc/2:read"
Walk each user:
alice: "doc/1" in resources? ✅
"read" in resources["doc/1"]? ✅ → append "alice"
bob: "doc/1" in resources? ✅
"read" in resources["doc/1"]? ✅ → append "bob"
result = ["alice", "bob"]
result.sort()
→ ["alice", "bob"]
_purge_expired(timestamp)
Both list_permissions and list_users_with_permission must call it first
Expired permissions must not show up in either list
TTL = time to live   ttl_ms = how long until expiry (milliseconds)   expires_at = absolute expiry timestamp   lazy = purge at the moment of use
def grant_with_ttl(self, timestamp, user_id, resource, permission, ttl_ms): # 授權 + 設 TTL
self._purge_expired(timestamp) # 開頭先清過期
if user_id not in self.permissions: # user 第一次見
self.permissions[user_id] = {} # 開個空 dict
if resource not in self.permissions[user_id]: # resource 第一次見
self.permissions[user_id][resource] = {} # 開個空 dict
self.permissions[user_id][resource][permission] = { # 寫入(覆蓋舊嘅)
"granted_at": timestamp, # 記低幾時授權
"expires_at": timestamp + ttl_ms, # 絕對過期時間 = 而家 + 壽命
}
return True # 一律成功
def __init__(self):
self.permissions = {}
Still no new instance variable; the TTL information lives inside each permission's info dict
self.permissions = { 權限表(user → resource → permission → info)
"alice": {
"doc/1": {
"read": {
"granted_at": 10, 幾時授權
"expires_at": None, 永遠唔過期(grant 加嘅)
},
"write": {
"granted_at": 20,
"expires_at": 5020, 5020 ms 過期(grant_with_ttl 加嘅)
},
},
},
}
self.backups = [] 備份 list(L4 先加)
self.locks = defaultdict(asyncio.Lock) per-user_id 鎖(L5 先加)
grant_with_ttl(100, "alice", "doc/1", "temp", 50)
→ expires_at = 100 + 50 = 150
time: 100 --- 130 --- 149 --- 150 --- 200
temp: ✅ ✅ ✅ ❌ ❌
↑ 150 >= 150 → expired
check(149, "alice", "doc/1", "temp")
_purge_expired(149): 149 >= 150? No → kept
permission is still there → True
check(150, "alice", "doc/1", "temp")
_purge_expired(150): 150 >= 150? Yes → deleted
permission is gone → False
_purge_expired(timestamp)
L3 is where it really earns its keep
Anything with expires_at not None and timestamp >= expires_at is deleted
so check and list_permissions naturally never see expired permissions
backup = deepcopy the whole state   restore = pick a stored backup by backup_id and rebuild it, recomputing each TTL permission's expiry from its remaining time
def backup(self, timestamp): # 影一張 snapshot(deepcopy)
self._purge_expired(timestamp) # 先清過期(唔影死嘅)
snapshot = copy.deepcopy(self.permissions) # deepcopy 成個 state
self.backups.append((timestamp, snapshot)) # 存入 backups list
return True # backup 成功
# 🟰 restore 主幹似 InMemDB/DNS,但入口改用 backup_id 揀第幾張相,再按當刻時間重算 expires_at
def restore(self, timestamp, backup_id): # 還原到某個 snapshot
if backup_id < 0 or backup_id >= len(self.backups): # backup_id 越界
return False # 冇呢張相
backup_ts, snapshot = self.backups[backup_id] # 攞出嗰張相同佢嘅 timestamp
restored = copy.deepcopy(snapshot) # deepcopy(唔好改原 snapshot)
# 重算 TTL:將 snapshot 入面嘅 expires_at 轉做 remaining_ttl 再轉返 new expires_at
for user_id, resources in restored.items(): # 逐個 user
for resource, perms in resources.items(): # 逐個 resource
for perm, info in perms.items(): # 逐個 permission
exp = info["expires_at"] # 攞 snapshot 時嘅 expires_at
if exp is not None: # 有 TTL
remaining = exp - backup_ts # 計返 snapshot 時仲剩幾耐
info["expires_at"] = timestamp + remaining # 由而家起再撐 remaining ms
self.permissions = restored # 用重建嘅 state 覆蓋
return True # 還原成功
def __init__(self):
self.permissions = {}
self.backups = [] list of (timestamp, snapshot)
self.permissions = {
"alice": {
"doc/1": {
"read": {"granted_at": 10, "expires_at": None},
"write": {"granted_at": 50, "expires_at": 200},
},
},
}
backup(100):
read: never expires
write: expires at 200, so 200 - 100 = 100 ms remain
deepcopy the whole self.permissions
the snapshot is identical to the live state
backups[0] = (100, snapshot)
return True
restore(500, 0):
backup_id = 0 → take backups[0]
backup_ts = 100, snapshot = {...}
deepcopy snapshot → restored
recompute the TTL of each permission:
read: expires_at = None → unchanged
write: expires_at = 200
remaining = 200 - 100 = 100
new expires_at = 500 + 100 = 600
Final self.permissions:
{
"alice": {
"doc/1": {
"read": {"granted_at": 10, "expires_at": None},
"write": {"granted_at": 50, "expires_at": 600},
},
},
}
write gets another 100 ms of life (it now expires at 600)
return True
Compared with InMemDB: the same logic, implemented differently.
InMemDB:
backup computes remaining = expiry - timestamp and stores it in the snapshot
restore computes new_expiry = restore_ts + remaining_ttl
Permission:
backup uses deepcopy (the snapshot already contains expires_at)
restore computes remaining as expires_at - backup_ts
then new expires_at = timestamp + remaining
InMemDB stores remaining_ttl; Permission stores the original expires_at and derives the remainder later
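The Permission-style restore arithmetic, sketched as a free function over a three-level dict. `restore_state` is a hypothetical name; it takes the snapshot and its backup timestamp directly rather than looking them up by backup_id.

```python
import copy


def restore_state(snapshot: dict, backup_ts: int, now: int) -> dict:
    """Deep-copy a snapshot and shift every expires_at by (now - backup_ts)."""
    restored = copy.deepcopy(snapshot)          # never mutate the stored snapshot
    for resources in restored.values():
        for perms in resources.values():
            for info in perms.values():
                if info["expires_at"] is not None:
                    remaining = info["expires_at"] - backup_ts   # time left at backup
                    info["expires_at"] = now + remaining         # same time left from now
    return restored


snapshot = {
    "alice": {
        "doc/1": {
            "read": {"granted_at": 10, "expires_at": None},
            "write": {"granted_at": 50, "expires_at": 200},
        },
    },
}
state = restore_state(snapshot, backup_ts=100, now=500)
print(state["alice"]["doc/1"]["write"]["expires_at"])   # 500 + (200 - 100)
```

Because the stored snapshot is left untouched, the same backup can be restored again later and still yields the full remaining time.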
batch = run several ops in one go   lock = per-user_id lock   gather = run everything concurrently
async def batch_operations(self, timestamp, ops):         # batch of ops (one lock per user_id + gather)
    async def execute_op(op):                              # async wrapper for each op
        user_id = op["user_id"]                            # the lock key
        async with self.locks[user_id]:                    # lock this user
            if op["type"] == "grant":
                return self.grant(timestamp, user_id,      # reuse L1 grant
                                  op["resource"], op["permission"])
            elif op["type"] == "check":
                return self.check(timestamp, user_id,      # reuse L1 check
                                  op["resource"], op["permission"])
            elif op["type"] == "revoke":
                return self.revoke(timestamp, user_id,     # reuse L1 revoke
                                   op["resource"], op["permission"])
            return None                                    # any other type → None
    tasks = [execute_op(op) for op in ops]                 # build every coroutine (not awaited yet)
    results = await asyncio.gather(*tasks)                 # run concurrently, wait for all, order preserved
    return list(results)                                   # gather already returns a list; list() just makes a copy
def __init__(self):
self.permissions = {}
self.backups = []
self.locks = defaultdict(asyncio.Lock)
self.locks = {
"alice": <asyncio.Lock>,
"bob": <asyncio.Lock>,
}
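defaultdict creates a lock on first access
Each user_id gets its own independent lock
Two ops locking different users → can run concurrently
Two ops locking the same user → the later one waits
Compared with the other mocks: FS locks per path, InMemDB locks per key, Permission locks per user_id. The concept is identical; only the lock granularity differs.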
ops = [
{"type":"grant","user_id":"alice","resource":"doc/1","permission":"read"},
{"type":"check","user_id":"bob","resource":"doc/1","permission":"read"},
{"type":"revoke","user_id":"alice","resource":"doc/1","permission":"write"},
]
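alice's two ops (grant + revoke) wait for each other (same lock)
bob's check can run alongside alice's ops (different lock)
→ [True, False, True] (assuming alice already holds doc/1:write and bob holds nothing on doc/1)
_purge_expired(timestamp)
called indirectly (grant / check / revoke each call it on their first line)
No extra helper (the lock logic is written directly inside batch_operations)
sync = synchronized transfer   semaphore = traffic light (caps the number of tasks running at once)   fail-fast = fail immediately when the precondition is not met, without waiting for the semaphore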
async def sync_permissions(self, timestamp, transfers, max_concurrent): # 並發 sync(fail-fast)
self._purge_expired(timestamp) # 開頭先清過期
sem = asyncio.Semaphore(max_concurrent) # 開一個 N 位嘅 semaphore
tasks = [] # 暫存所有 coroutine task
for transfer in transfers: # 逐個 transfer 包做一個 task
task = self._do_one_sync(timestamp, transfer, sem) # 起 coroutine(未 await)
tasks.append(task) # 入 list
results = await asyncio.gather(*tasks) # 並發跑,等全部完,保留順序
return list(results) # return list[bool]
async def _do_one_sync(self, timestamp, transfer, sem): # 做單一 transfer(async helper)
user_id = transfer["user_id"] # 攞 user_id
resource = transfer["resource"] # 攞 resource
permission = transfer["permission"] # 攞 permission
# fail-fast:未攞 semaphore 之前已經 check(唔阻住其他 task)
if user_id not in self.permissions: # user 唔存在
return False # 即刻 False,唔 acquire semaphore
if resource not in self.permissions[user_id]: # resource 唔存在
return False # 即刻 False
if permission not in self.permissions[user_id][resource]: # permission 唔存在
return False # 即刻 False
async with sem: # 過咗 fail-fast 先攞 semaphore(限速)
await asyncio.sleep(0.01) # 模擬 sync 延遲(10ms)
return True # sync 成功
def __init__(self):
self.permissions = {}
self.backups = []
self.locks = defaultdict(asyncio.Lock)
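Same as L5; the semaphore is created inside the method (per call)
The user must exist and must hold the specified permission. If not, return False immediately without acquiring the semaphore. Only when the check passes do we take the semaphore and then sleep to simulate latency.
Why fail-fast goes before the semaphore: slots are limited (say max_concurrent=2). If failing transfers also queued for the semaphore, they would hold up the successful transfers behind them. fail-fast = the ineligible ones leave at once and never occupy a slot.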
self.permissions = {
"alice": {"doc/1": {"read": {...}}},
"bob": {"doc/1": {"read": {...}}},
}
transfers = [
{"user_id":"alice","resource":"doc/1","permission":"read"},
{"user_id":"nobody","resource":"doc/1","permission":"read"},
{"user_id":"bob","resource":"doc/1","permission":"read"},
{"user_id":"alice","resource":"doc/1","permission":"delete"},
]
max_concurrent = 2
transfer 0: alice has doc/1:read → ✅ takes sem → sleep → True
transfer 1: nobody does not exist → ❌ leaves at once → False (no sleep)
transfer 2: bob has doc/1:read → ✅ takes sem → sleep → True
transfer 3: alice lacks doc/1:delete → ❌ leaves at once → False (no sleep)
→ [True, False, True, False]
Permission L6: return list[bool]
Same as FS L6 (sync_files): returns a list
DNS L6: return list[bool] as well (see propagate_records)
InMemDB L6: return dict {key: scan_result}
On the exam, read the spec carefully for list vs dict
4 transfers, max_concurrent=2
2 fail fast (leave immediately) + 2 succeed (sleep 0.01)
fail-fast takes no semaphore slot →
the two successes sleep at the same time →
total ≈ 0.01 s (not 0.02)
Without fail-fast (all-sleep):
all 4 must sleep, max_concurrent=2
→ two rounds × 0.01 = 0.02 s
Exam timing assertions check for this difference
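The timing gap can be measured with a small sketch. `run_batch` and the boolean `jobs` list (True = valid transfer) are assumptions for illustration, and the delay is raised to 0.05 s so the difference stands out from scheduling noise.

```python
import asyncio
import time


async def run_batch(jobs, max_concurrent, fail_fast, delay=0.05):
    sem = asyncio.Semaphore(max_concurrent)

    async def one(valid):
        if fail_fast and not valid:         # fail-fast: invalid jobs never queue for a slot
            return False
        async with sem:                     # all-sleep: invalid jobs also wait and sleep here
            await asyncio.sleep(delay)
            return valid

    start = time.perf_counter()
    results = await asyncio.gather(*(one(v) for v in jobs))
    return results, time.perf_counter() - start


jobs = [True, False, True, False]           # same shape as the 4-transfer example
fast_results, fast = asyncio.run(run_batch(jobs, 2, fail_fast=True))
slow_results, slow = asyncio.run(run_batch(jobs, 2, fail_fast=False))
print(f"fail-fast: {fast:.3f}s   all-sleep: {slow:.3f}s")
```

Both variants return the same list; only the elapsed time differs (roughly one delay versus two), which is why a wrong choice tends to show up in timing assertions rather than value assertions.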
Round-robin hands out requests in turn. Health checks drop unhealthy servers. Sticky sessions remember which server a user went to last time.
add_server(ts, id, weight). route_request(ts, req_id) → picks a server round-robin and returns its server_id. get_request_count(ts, id).
self.servers = {}   # server_id → {weight, request_count, last_heartbeat}
self.rr_index = 0   # round-robin pointer
def add_server(self, timestamp, server_id, weight):  # open a new shop on the street
    if server_id in self.servers:
        return False                                 # server_id already exists
    self.servers[server_id] = {"weight": weight, "request_count": 0, "last_heartbeat": timestamp}  # new entry
    self.rr_index = 0                                # server set changed → reset index
    return True                                      # added
def route_request(self, timestamp, request_id):      # pick the next healthy server (mutates rr_index and request_count)
    healthy = self._healthy_servers(timestamp)       # healthy server ids, sorted by id
    if not healthy:
        return None                                  # nobody to route to
    if self.rr_index >= len(healthy):
        self.rr_index = 0                            # wrap around
    picked = healthy[self.rr_index]                  # server at the pointer
    self.rr_index += 1                               # advance the pointer for the next request
    self.servers[picked]["request_count"] += 1       # count this request against the chosen server
    return picked                                    # a server_id, not a bool
Round-robin: servers sorted alphabetically
the index points at the next server and wraps around
add/remove server → reset index = 0
route returns a server_id, not a bool
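A minimal sketch of the four rules above, without health checks or request counts. The class name `RoundRobin` and its method names are hypothetical.

```python
class RoundRobin:
    """Round-robin over server ids in alphabetical order."""

    def __init__(self):
        self.servers = set()
        self.index = 0                      # points at the next server to use

    def add(self, server_id):
        if server_id in self.servers:
            return False
        self.servers.add(server_id)
        self.index = 0                      # membership changed: start over
        return True

    def route(self):
        ordered = sorted(self.servers)      # alphabetical, recomputed per call
        if not ordered:
            return None
        if self.index >= len(ordered):      # wrap around
            self.index = 0
        picked = ordered[self.index]
        self.index += 1
        return picked


rr = RoundRobin()
rr.add("b")
rr.add("a")
print([rr.route() for _ in range(3)])       # cycles through a, b, then wraps
rr.add("c")
print(rr.route())                           # index was reset by the add
```

Sorting on every call is wasteful for large pools, but it keeps the order deterministic, which is what this kind of mock is usually graded on.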
heartbeat(ts, id) → records that the server is alive. set_health_timeout(ts, timeout) → a server is unhealthy once ts - last_heartbeat > timeout.
def _is_healthy(self, server_id, timestamp):    # has the server sent a heartbeat recently enough?
    if self.health_timeout is None:
        return True                              # no timeout configured → every server counts as healthy
    info = self.servers.get(server_id)           # None if the server is unknown
    if info is None:
        return False                             # unknown server
    return (timestamp - info["last_heartbeat"]) <= self.health_timeout   # a bool; the parentheses only group
Lazy health check, not a TTL
<= timeout = healthy (unlike the strict < used for TTL)
route_request only picks healthy servers
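The inclusive boundary is easy to mix up with the strict TTL one, so here is a side-by-side sketch. Both function names are hypothetical free-function versions of the helpers discussed in these notes.

```python
from typing import Optional


def is_healthy(last_heartbeat: int, now: int, timeout: Optional[int]) -> bool:
    """Health check: still healthy when exactly `timeout` has elapsed (inclusive)."""
    if timeout is None:                     # no timeout configured
        return True
    return now - last_heartbeat <= timeout


def is_alive(expiry: Optional[int], now: int) -> bool:
    """TTL check for contrast: dead at the expiry tick itself (strict)."""
    return expiry is None or now < expiry


# Heartbeat at 100 with timeout 50 versus a record that expires at 150.
for now in (149, 150, 151):
    print(now, is_healthy(100, now, 50), is_alive(150, now))
```

At `now = 150` the server is still healthy while the TTL record is already dead, a one-tick difference that boundary tests like to probe.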
route_user_request(ts, req_id, user_id) → if the user has a sticky server and it is healthy, reuse it. Otherwise round-robin and start a new session.
self.sticky = {}   # user_id → server_id
1. sticky exists + server healthy → reuse it
2. no sticky / unhealthy → round-robin failover
3. record session_history: [(server_id, count)]
failover = the old session is pushed into history
a new session starts, count starting from 1
get_session_history returns everything (including the current session)
async def health_check_external(self, timestamp, checks, max_concurrent):   # concurrent health checks (fail-fast)
    sem = asyncio.Semaphore(max_concurrent)   # N permits
    async def do_check(check):   # one health check
        if check["server_id"] not in self.servers: return False   # fail-fast: an unknown server never takes a permit
        async with sem:   # wait for a permit
            await asyncio.sleep(0.01)   # simulated external check; the answer is not instant
            self.servers[check["server_id"]]["last_heartbeat"] = timestamp   # refresh the heartbeat only after the check succeeds
            return True
    tasks = []
    for c in checks:
        tasks.append(do_check(c))
    results = await asyncio.gather(*tasks)   # results come back in input order
    return list(results)
fail-fast + refresh the heartbeat after success
New here: there is a side effect after the sleep (not just a return value)
A fixed-capacity cache. set/get/delete. When it is full, evict the entry with the oldest last_access.
__init__(capacity). set(key, val) → overwrite if the key exists (no eviction). New key + full → evict LRU. get(key) → access_count++.
self.data = {}   # every slot in the fridge: besides the value, track when it was last touched and whether it expires
self.capacity = capacity   # total slots; set uses this to decide whether to evict
self.eviction_history = []   # evicted keys, in order, for later lookup
def set(self, key, value):   # put an item in the fridge; existing key is overwritten, a new key evicts only when full
    if key in self.data:   # slot already taken: update in place, no eviction
        self.data[key]["value"] = value   # access_count and last_access are not touched by set
        return None   # this setter returns nothing; None is not a failure
    if len(self.data) >= self.capacity:   # full: make room first
        self._evict_lru()   # the helper picks the victim
    self.data[key] = {"value": value, "access_count": 0, "last_access": 0, "expiry": None}   # a new entry starts with no accesses
def get(self, key):   # take an item out and record that it was touched
    if key not in self.data: return None   # miss
    self.data[key]["access_count"] += 1   # count every successful read
    return self.data[key]["value"]
set on an existing key = UPDATE, no eviction
set on a new key + full = evict, then store
get increments access_count (set does not)
last_access=0 for plain set
def set_with_ttl(self, key, value, timestamp, ttl):   # like set, but the item carries an expiry time
    # same as set, but expiry = timestamp + ttl and last_access = timestamp
def _evict_lru(self):   # the fridge is full: throw out the item nobody has touched for the longest time
    # Part 1: pick the coldest candidate
    lru_key = None   # no victim yet
    for k, entry in self.data.items():
        if lru_key is None: lru_key = k; continue   # the first item is the initial candidate
        if entry["last_access"] < self.data[lru_key]["last_access"]:   # older access → better victim
            lru_key = k
        elif entry["last_access"] == self.data[lru_key]["last_access"] and k < lru_key:   # tie → alphabetically smaller key wins
            lru_key = k   # keeps the answer deterministic
    # Part 2: actually evict it
    del self.data[lru_key]   # free one slot for the new item
    self.eviction_history.append(lru_key)   # record who was evicted
LRU = the entry with the smallest last_access
Expired entries still occupy a slot and can be evicted
get_at(key, ts) updates last_access = ts
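The victim selection, including the tie-break, fits in one `min` call with a tuple key. This is an equivalent sketch rather than the loop used in the notes; `pick_lru` is a made-up name.

```python
def pick_lru(data):
    """Return the key with the smallest last_access; ties go to the smaller key."""
    return min(data, key=lambda k: (data[k]["last_access"], k))

data = {
    "b": {"last_access": 5},
    "a": {"last_access": 5},   # same age as "b", wins the tie alphabetically
    "c": {"last_access": 9},
}
victim = pick_lru(data)
print(victim)   # a
```

The explicit loop is easier to debug under exam pressure; the one-liner is handy for checking your answer.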
# L5: lock per key
# L6: ALL-SLEEP: whether or not the key is still in the cache, every request queues and sleeps before it gets an answer
async def sync_cache(self, timestamp, sync_requests, max_concurrent):   # batch sync; not fail-fast, every request goes through the whole flow
    sem = asyncio.Semaphore(max_concurrent)   # at most N sync requests in flight
    # Step 1: define how one request takes a permit, waits, then reports
    async def do_sync(req):   # one request = sync one key
        async with sem:   # success or failure, take a permit first: that is the all-sleep rule
            await asyncio.sleep(0.01)   # simulated external sync; paid even when the key is missing
            return req["key"] in self.data   # check existence only after the sleep
    # Step 2: collect every request, then hand them all to gather
    tasks = []
    for r in sync_requests:   # keep input order so the output lines up with it
        tasks.append(do_sync(r))
    results = await asyncio.gather(*tasks)   # the semaphore decides how many actually run at once
    return list(results)   # one bool per request, aligned with the input
ALL-SLEEP pattern
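The ticket board in a cha chaan teng kitchen. The boss pins tickets on the board, cooks take tickets and make the food. Write a class that simulates it.
Picture the kitchen's ticket board (the queue):
┌──────────────────────────────────────────────┐
│ 單1: pineapple bun     priority=3            │
│ 單2: milk tea          priority=1            │
│ 單3: iced lemon tea    priority=5 ← VIP rush │
│ 單4: French toast      priority=3            │
└──────────────────────────────────────────────┘
Each ticket has:
task_id = the ticket number ("單1", i.e. ticket 1)
priority = how urgent it is (bigger number = more urgent)
Cook's rules:
1. Always take the ticket with the highest priority
2. Same priority? Take the one written first (FIFO)
# Example: pickup order for the board above
priority 5: 單3 ← highest → taken first
priority 3: 單1, 單4 ← tie → 單1 was written first → taken second
priority 3: 單4 ← taken third
priority 1: 單2 ← lowest → taken last
# So the pickup order is:
# 單3 → 單1 → 單4 → 單2
# Later levels add more:
# L2 adds status: QUEUED → PROCESSING → COMPLETED / FAILED
# L3 adds retry: a failed task can be retried
# L4 adds dependencies: a task waits until other tasks are done
# L5 adds a worker pool: N cooks working at the same time
# L6 adds dispatch: completed tickets are sent to an external system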
import asyncio
class TaskQueue:
    def __init__(self):
        self.tasks = {}                # L1: all tickets
        self.add_counter = 0           # L1: how many tickets written so far
        self.max_retries = 0           # added in L3
        self.base_backoff_ms = 0       # added in L3
        self._lock = asyncio.Lock()    # added in L5
self.tasks = {
    "單1": {"priority": 3, "added_order": 1},
    "單2": {"priority": 1, "added_order": 2},
    "單3": {"priority": 5, "added_order": 3},
}
# outer key = ticket id
# inner dict = that ticket's info
task_id │ priority │ added_order
────────┼──────────┼─────────────
單1     │ 3        │ 1
單2     │ 1        │ 2
單3     │ 5        │ 3
self.tasks = {}        # the table above, empty at the start
self.add_counter = 0   # tickets written so far, starts at 0
# L3 (retry) adds attempt / retry_time fields to every ticket
# L4 (dependencies) adds a dependencies field
# L1 does not need them, so it starts simple
# Check whether all of a task's dependencies are done
def _deps_met(self, task_id):   # used from L4: are all dependencies COMPLETED?
    deps = self.tasks[task_id]["dependencies"]   # this ticket's prerequisite list
    for dep_id in deps:
        if dep_id not in self.tasks:   # dependency does not exist → not met
            return False
        if self.tasks[dep_id]["status"] != "COMPLETED":   # dependency not finished → not met
            return False
    return True   # all COMPLETED → met
def _get_next_ready_task_id(self):   # highest priority, then FIFO, among ready tasks
    candidates = []
    for tid, task in self.tasks.items():
        if task["status"] != "QUEUED":   # only QUEUED tasks qualify
            continue
        if not self._deps_met(tid):   # dependencies not done → skip
            continue
        priority = task["priority"]
        order = task["added_order"]
        candidates.append((-priority, order, tid))   # build the sort tuple
    if not candidates:
        return ""   # nothing ready → empty string
    candidates.sort()   # default tuple sort: smallest first
    first = candidates[0]
    return first[2]   # third element of the tuple = tid
_deps_met(task_id): checks whether this task's prerequisites are all done
Example: task_c has dependencies = ["task_a", "task_b"]
_deps_met("task_c"):
task_a status = "COMPLETED" → OK
task_b status = "PROCESSING" → not finished → return False
so task_c cannot start yet
dependencies = [] → no prerequisites → return True straight away
dep_id not in self.tasks → treated as not met (not an error)
Scan every task → must be QUEUED + deps met → add to candidates → sort by (-priority, added_order) → take the first
Example:
tasks contains 3 that are QUEUED + deps met:
task_x: priority=5, added_order=1
task_y: priority=10, added_order=3
task_z: priority=5, added_order=2
sort key = (-priority, added_order)
task_y: (-10, 3) ← smallest (-10 is the smallest = highest priority)
task_x: (-5, 1) ← second
task_z: (-5, 2) ← third (same priority, larger added_order)
return "task_y" (the highest priority)
When task_x and task_z have the same priority:
task_x added_order=1 < task_z added_order=2
→ task_x goes first (FIFO)
task = a job · priority = who goes first · FIFO = first in, first out · queue = the waiting line
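The task_x / task_y / task_z example can be checked directly with `sorted` and a tuple key. A minimal sketch using only the values from the example:

```python
tasks = {
    "task_x": {"priority": 5, "added_order": 1},
    "task_y": {"priority": 10, "added_order": 3},
    "task_z": {"priority": 5, "added_order": 2},
}

# Negate priority so bigger priorities sort first; added_order breaks ties (FIFO).
order = sorted(tasks, key=lambda tid: (-tasks[tid]["priority"], tasks[tid]["added_order"]))
print(order)   # ['task_y', 'task_x', 'task_z']
```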
def add_task(self, timestamp, task_id, priority):   # the boss brings a new ticket to pin on the board
    if task_id in self.tasks:   # has this id been registered before?
        return False   # duplicate → reject
    self.add_counter += 1   # global counter +1 (FIFO ordering relies on it)
    self.tasks[task_id] = {   # open a slot on the board
        "priority": priority,   # bigger number = more urgent
        "status": "QUEUED",   # waiting in line (the field is used from L2)
        "added_order": self.add_counter,   # tie-break when priorities are equal
        "attempt": 0,   # attempts so far (used by L3 retry, 0 in L1)
        "retry_time": None,   # when it may be retried (L3)
        "dependencies": [],   # tickets that must finish first (L4 DAG)
    }
    return True   # registered
def get_next_task(self, timestamp):   # a cook comes to take the next ticket
    tid = self._get_next_ready_task_id()   # helper picks the QUEUED task with highest priority, earliest added
    if not tid:   # helper returned "" → nothing to do
        return ""   # by convention an empty string (not None)
    self.tasks[tid]["status"] = "PROCESSING"   # picked → mark in progress (state machine starts in L2)
    return tid   # hand the ticket id to the cook
def get_task_status(self, timestamp, task_id):   # where is this ticket now?
    if task_id not in self.tasks:   # never registered
        return ""   # missing → empty string (per spec)
    return self.tasks[task_id]["status"]   # "QUEUED" / "PROCESSING" / ...
def __init__(self):
    self.tasks = {}
    self.add_counter = 0
self.tasks = {                     # the board (key = ticket id)
    "單1": {
        "priority": 3,             # importance (bigger = more urgent)
        "added_order": 1,          # registration order (FIFO relies on it)
    },
}
self.add_counter = 1               # global counter (tickets written so far)
self.max_retries = 0               # max retries (added in L3)
self.base_backoff_ms = 0           # base wait time (added in L3)
self._lock = asyncio.Lock()        # global lock (added in L5)
def add_task(self, timestamp, task_id, priority):
    if task_id in self.tasks:
        return False
    self.add_counter += 1
    self.tasks[task_id] = {
        "priority": priority,
        "added_order": self.add_counter,
    }
    return True
Someone says: "I want to add a ticket, id "單1", priority = 3."
def add_task(self, timestamp, task_id, priority):
    if task_id in self.tasks:       # is this id already on the board?
        return False                # yes → return False, do not add
    self.add_counter += 1           # counter +1 (you are the Nth ticket written)
    self.tasks[task_id] = {         # pin it on the board
        "priority": priority,
        "added_order": self.add_counter,
    }
    return True
With equal priority, the ticket written first goes first (FIFO). Do not rely on the dict itself to remember who was written first (that order is gone as soon as you sort), so keep your own counter that records "I was the Nth ticket written".
The real work happens in the helper _get_next_ready_task_id.
Scan every ticket → keep only QUEUED → sort → take the first → switch it to PROCESSING → return its task_id.
The sort key is a tuple, compared element by element
negated = bigger values first
not negated = smaller values first (the default)
Example: (-priority, added_order)
= higher priority first; with equal priority, smaller added_order first (FIFO)
get_task_status(103, "單1") → "QUEUED"
get_task_status(104, "unknown") → "" (does not exist)
task = a job · start = begin working · complete = finished · fail = could not do it · status = current state
# start_task = you name the ticket; get_next_task = the system picks by priority + FIFO
# both move QUEUED → PROCESSING; the difference is who chooses the ticket
def start_task(self, timestamp, task_id):   # a cook says "I want this ticket"
    if task_id not in self.tasks:   # not on the board → reject
        return False
    if self.tasks[task_id]["status"] != "QUEUED":   # only QUEUED can start; PROCESSING/COMPLETED/FAILED cannot start again
        return False
    self.tasks[task_id]["status"] = "PROCESSING"   # queued → in progress
    return True
def complete_task(self, timestamp, task_id):   # a cook says "done"
    if task_id not in self.tasks:   # not registered → reject
        return False
    if self.tasks[task_id]["status"] != "PROCESSING":   # must be PROCESSING; a QUEUED ticket has not even started
        return False
    self.tasks[task_id]["status"] = "COMPLETED"   # in progress → done
    return True
def fail_task(self, timestamp, task_id):   # the cook could not make it
    if task_id not in self.tasks:   # not registered → reject
        return False
    if self.tasks[task_id]["status"] != "PROCESSING":   # only a ticket in progress can fail
        return False
    self.tasks[task_id]["status"] = "FAILED"   # in progress → failed (L3 adds retry logic here)
    return True
def get_queue_length(self, timestamp):   # how many tickets are still waiting
    count = 0
    for task in self.tasks.values():   # values only, the key is not needed
        if task["status"] == "QUEUED":   # PROCESSING/COMPLETED/FAILED do not count
            count += 1
    return count
def get_tasks_by_status(self, timestamp, status):   # list every ticket id with the given status
    result = []
    for tid, task in self.tasks.items():
        if task["status"] == status:
            result.append(tid)
    result.sort()   # alphabetical, so the output is deterministic
    return result
add_task (written in L1) needs a change:
the inner dict gains a "status": "QUEUED" field
get_next_task (written in L1) needs a change:
after taking a ticket do not delete it, set status = "PROCESSING" instead
def __init__(self):
    self.tasks = {}
    self.add_counter = 0
self.tasks = {                     # the board (key = ticket id)
    "單1": {
        "priority": 3,             # importance
        "status": "QUEUED",        # current state (added in L2)
        "added_order": 1,          # registration order
    },
    "單2": {
        "priority": 5,
        "status": "PROCESSING",    # in progress
        "added_order": 2,
    },
}
self.add_counter = 2               # global counter
self.max_retries = 0               # max retries (added in L3)
self.base_backoff_ms = 0           # base wait time (added in L3)
self._lock = asyncio.Lock()        # global lock (added in L5)
def add_task(self, timestamp, task_id, priority):
    if task_id in self.tasks:
        return False
    self.add_counter += 1
    self.tasks[task_id] = {
        "priority": priority,
        "status": "QUEUED",        # ← added in L2: every new ticket starts QUEUED
        "added_order": self.add_counter,
    }
    return True
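Every ticket now has a full life (status lifecycle). In L1, get_next_task made the ticket disappear once taken; L2 switches it to PROCESSING instead (a cook is working on it). L2 adds 3 new methods for state transitions (start_task, complete_task, fail_task) + 2 query methods.
Both move QUEUED → PROCESSING, but who picks the ticket differs:
get_next_task() ← the system picks for you
    the cook says "give me something to do"
    the system picks the top ticket by priority + FIFO
    returns the task_id
start_task(task_id) ← you name it
    the boss says "cook, you take 單3"
    priority is ignored, a specific ticket is started
    returns True / False (whether it worked)
[QUEUED] ──── start_task ────► [PROCESSING]
                                    │
                          ┌─────────┴─────────┐
                          │                   │
                    complete_task         fail_task
                          │                   │
                          ▼                   ▼
                     [COMPLETED]          [FAILED]
# Rule: no skipping states
# start_task requires QUEUED
# complete_task and fail_task require PROCESSING
# COMPLETED / FAILED are terminal, no further transitions
1. check the task exists
2. check the current status is the right one
3. change the status
4. return True
# The three methods are 99% the same code; only two things change:
# - the status being checked (QUEUED / PROCESSING)
# - the status being set (PROCESSING / COMPLETED / FAILED)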
self.tasks = {
"單1": {"priority": 5, "added_order": 1, "status": "QUEUED"},
"單2": {"priority": 3, "added_order": 2, "status": "QUEUED"},
}
get_queue_length() → 2
# 廚師 A: get_next_task()
# → 攞 priority 最大 → "單1"
# → 單1.status = "PROCESSING"
# → return "單1"
get_queue_length() → 1(單2 仲係 QUEUED)
get_tasks_by_status("PROCESSING") → ["單1"]
# 廚師 A: complete_task("單1")
# → 單1.status = "PROCESSING" ✅
# → 轉做 COMPLETED → return True
# 廚師 A: complete_task("單1")(試吓再 complete 一次)
# → 單1.status = "COMPLETED" ≠ "PROCESSING"
# → return False
# 廚師 B: get_next_task() → "單2" → PROCESSING
# 廚師 B: fail_task("單2")(做唔到)
# → 單2.status = "PROCESSING" ✅
# → 轉做 FAILED → return True
get_tasks_by_status("COMPLETED") → ["單1"]
get_tasks_by_status("FAILED") → ["單2"]
get_queue_length() → 0
L2 default: max_retries = 0 → 直接 FAILED
L3 加:max_retries > 0 + attempt < max_retries → RETRY_SCHEDULED
retry = try again · backoff = how long to wait before retrying · exponential = wait longer each time · dead letter = a job that has been given up on
def configure_retry(self, max_retries, base_backoff_ms):   # set the retry rules: how many retries, how long to wait
    self.max_retries = max_retries   # maximum number of retries
    self.base_backoff_ms = base_backoff_ms   # base wait time (doubles on each retry)
# ─────────── fail_task, L3 version ───────────
# A failure is not final straight away; first check whether the task may still retry
def fail_task(self, timestamp, task_id):   # PROCESSING → FAILED or RETRY_SCHEDULED
    if task_id not in self.tasks:   # no such ticket → reject
        return False
    if self.tasks[task_id]["status"] != "PROCESSING":   # nobody is working on it, so there is nothing to fail
        return False
    task = self.tasks[task_id]   # shortcut so the lines below stay short
    # ↓↓↓ new in L3: retry decision ↓↓↓
    if self.max_retries > 0 and task["attempt"] < self.max_retries:   # retries are enabled and some are left
        backoff = self.base_backoff_ms * (2 ** task["attempt"])   # more attempts → longer wait
        task["retry_time"] = timestamp + backoff   # now + wait time
        task["attempt"] += 1
        task["status"] = "RETRY_SCHEDULED"   # waiting for a retry
    else:   # retries disabled or used up
        task["status"] = "FAILED"   # give up for real
    # ↑↑↑ end of L3 addition ↑↑↑
    return True   # the failure was handled
def process_retries(self, timestamp):   # wake up every retry whose time has come
    count = 0   # how many were woken
    for task in self.tasks.values():
        if task["status"] != "RETRY_SCHEDULED":   # not waiting for a retry → skip
            continue
        if task["retry_time"] > timestamp:   # not due yet → skip
            continue
        task["status"] = "QUEUED"   # wake it up: back in the queue
        task["retry_time"] = None   # no longer needed
        count += 1
    return count
def get_dead_letter(self, timestamp):   # ids of every task that has been given up on
    result = []
    for tid, task in self.tasks.items():
        if task["status"] == "FAILED":
            result.append(tid)
    result.sort()   # alphabetical
    return result
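QUEUED            waiting in line (for a cook to take it)
PROCESSING        in progress
COMPLETED         done
FAILED            given up (retries used up, or retries disabled)
RETRY_SCHEDULED   ← new in L3: failed, waiting for a retry
Every call to process_retries loops once over the whole of self.tasks and looks at each ticket:
• not RETRY_SCHEDULED (i.e. QUEUED, PROCESSING, COMPLETED, FAILED) → skip
• RETRY_SCHEDULED but retry_time not reached → skip
• RETRY_SCHEDULED and due → wake up (back to QUEUED)
Even if you have 100 tickets and only 2 are RETRY_SCHEDULED, the loop still runs 100 times (it looks at every one). The other 98 are skipped by the single line if status != "RETRY_SCHEDULED": continue. Efficiency is not a concern in the exam; brute force is enough.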
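The exponential backoff used by the L3 fail_task (base_backoff_ms × 2^attempt) is worth tabulating once. A sketch with made-up helper names; the formula itself is the one from the notes.

```python
def backoff_schedule(max_retries, base_backoff_ms):
    """Wait time before each retry, indexed by the attempt count at failure time."""
    return [base_backoff_ms * (2 ** attempt) for attempt in range(max_retries)]

def retry_time(fail_timestamp, attempt, base_backoff_ms):
    """Timestamp at which a task that failed with `attempt` prior attempts becomes due."""
    return fail_timestamp + base_backoff_ms * (2 ** attempt)

schedule = backoff_schedule(3, 100)
print(schedule)                  # [100, 200, 400]
print(retry_time(150, 1, 100))   # 350
```

Note that the backoff is computed before `attempt` is incremented, so the first retry waits exactly `base_backoff_ms`.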
def __init__(self):
    self.tasks = {}
    self.add_counter = 0
    self.max_retries = 0               # ← added in L3
    self.base_backoff_ms = 0           # ← added in L3
self.tasks = {                         # the board (key = ticket id)
    "單1": {
        "priority": 5,                 # importance
        "status": "RETRY_SCHEDULED",   # waiting for a retry (new status in L3)
        "added_order": 1,              # registration order
        "attempt": 2,                  # attempts so far (added in L3)
        "retry_time": 350,             # when it may be retried (added in L3)
    },
}
self.add_counter = 1                   # global counter
self.max_retries = 2                   # max retries (added in L3, set by configure_retry)
self.base_backoff_ms = 100             # base wait time (added in L3, set by configure_retry)
self._lock = asyncio.Lock()            # global lock (added in L5)
def add_task(self, timestamp, task_id, priority):
    if task_id in self.tasks:
        return False
    self.add_counter += 1
    self.tasks[task_id] = {
        "priority": priority,
        "status": "QUEUED",
        "added_order": self.add_counter,
        "attempt": 0,                  # ← added in L3: a new ticket has never been tried
        "retry_time": None,            # ← added in L3: no retry scheduled
    }
    return True
dependency = prerequisite job · DAG = the graph of which job comes before which · blocked = waiting for prerequisites to finish
def add_task_with_deps(self, timestamp, task_id, priority, dependencies):   # add a ticket that must wait for other tickets
    if task_id in self.tasks:   # id already exists → reject
        return False
    self.add_counter += 1
    self.tasks[task_id] = {   # same as add_task, plus dependencies
        "priority": priority,
        "status": "QUEUED",
        "added_order": self.add_counter,
        "attempt": 0,
        "retry_time": None,
        "dependencies": list(dependencies),   # copy with list() so later changes by the caller do not leak in
    }
    return True
def get_ready_tasks(self, timestamp):   # every ticket that can start now (QUEUED + prerequisites done)
    candidates = []
    for tid, task in self.tasks.items():
        if task["status"] != "QUEUED":   # not waiting → skip
            continue
        if not self._deps_met(tid):   # prerequisites not done → skip
            continue
        priority = task["priority"]
        candidates.append((-priority, tid))   # negated priority (bigger first), tid as alphabetical tie-break
    candidates.sort()   # smallest tuple first
    result = []
    for item in candidates:
        result.append(item[1])   # second element of the tuple = tid
    return result
def get_blocked_tasks(self, timestamp):   # every stuck ticket (QUEUED but prerequisites not done)
    blocked = []
    for tid, task in self.tasks.items():
        if task["status"] != "QUEUED":   # not waiting → skip
            continue
        if self._deps_met(tid):   # prerequisites done → that is ready, not blocked
            continue
        blocked.append(tid)
    blocked.sort()   # alphabetical
    return blocked
Some tickets have to wait for other tickets. For example "cook the rice" waits for "wash the rice". L4 adds 1 new method for adding a ticket with dependencies, plus 2 query methods (ready / blocked). start_task and get_next_task also have to check dependencies.
def __init__(self):
    self.tasks = {}
    self.add_counter = 0
    self.max_retries = 0
    self.base_backoff_ms = 0
self.tasks = {                              # the board (key = ticket id)
    "單1": {
        "priority": 5,                      # importance
        "status": "QUEUED",                 # current state
        "added_order": 1,                   # registration order
        "attempt": 0,                       # attempts so far
        "retry_time": None,                 # when it may be retried
        "dependencies": [],                 # no prerequisites (added in L4)
    },
    "單2": {
        "priority": 5,
        "status": "QUEUED",
        "added_order": 2,
        "attempt": 0,
        "retry_time": None,
        "dependencies": ["單1"],            # waits for 單1 (added in L4)
    },
    "單3": {
        "priority": 5,
        "status": "QUEUED",
        "added_order": 3,
        "attempt": 0,
        "retry_time": None,
        "dependencies": [],
    },
    "單4": {
        "priority": 5,
        "status": "QUEUED",
        "added_order": 4,
        "attempt": 0,
        "retry_time": None,
        "dependencies": ["單2", "單3"],     # waits for 單2 and 單3
    },
}
self.add_counter = 4                        # global counter
self.max_retries = 2                        # max retries
self.base_backoff_ms = 100                  # base wait time
self._lock = asyncio.Lock()                 # global lock (added in L5)
def add_task(self, timestamp, task_id, priority):
    if task_id in self.tasks:
        return False
    self.add_counter += 1
    self.tasks[task_id] = {
        "priority": priority,
        "status": "QUEUED",
        "added_order": self.add_counter,
        "attempt": 0,
        "retry_time": None,
        "dependencies": [],                 # ← added in L4: no prerequisites by default
    }
    return True
def add_task_with_deps(self, timestamp, task_id, priority, dependencies):
    if task_id in self.tasks:
        return False
    self.add_counter += 1
    self.tasks[task_id] = {
        "priority": priority,
        "status": "QUEUED",
        "added_order": self.add_counter,
        "attempt": 0,
        "retry_time": None,
        "dependencies": list(dependencies), # ← the only difference from add_task: uses the list passed in
    }
    return True
Take every dependency id of the task (the tickets it has to wait for), then check them one by one:
• the ticket does not exist at all (never added) → not done → not met
• the ticket exists but is not COMPLETED yet → also not met
• every dependency is COMPLETED → met, this task can start
def _deps_met(self, task_id):
    deps = self.tasks[task_id]["dependencies"]            # this ticket's prerequisite list
    for dep_id in deps:                                   # check each prerequisite
        if dep_id not in self.tasks:                      # ticket does not exist → not done → not met
            return False
        if self.tasks[dep_id]["status"] != "COMPLETED":   # exists but not finished → not met
            return False
    return True                                           # every prerequisite is done → met
add_task_with_deps(1, "deploy", 5, ["build"])
but "build" has not been added to self.tasks yet
→ _deps_met("deploy") checks "build" → the ticket does not exist → False
→ deploy stays blocked until someone adds and completes "build"
The spec allows adding the waiting ticket first and the ticket it waits for later
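The "deploy waits for a build that does not exist yet" case can be traced with a stand-alone version of the helper. This sketch takes the tasks dict as a parameter instead of using self, purely so it runs on its own.

```python
def deps_met(tasks, task_id):
    """True only if every dependency exists and is COMPLETED."""
    for dep_id in tasks[task_id]["dependencies"]:
        dep = tasks.get(dep_id)
        if dep is None or dep["status"] != "COMPLETED":
            return False
    return True

tasks = {"deploy": {"status": "QUEUED", "dependencies": ["build"]}}
before_add = deps_met(tasks, "deploy")       # "build" was never added

tasks["build"] = {"status": "QUEUED", "dependencies": []}
after_add = deps_met(tasks, "deploy")        # added but not finished

tasks["build"]["status"] = "COMPLETED"
after_complete = deps_met(tasks, "deploy")   # prerequisite done

print(before_add, after_add, after_complete)   # False False True
```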
worker = someone doing the work · pool = a group of workers working at once · lock = keeps others out · gather = run together
run_workers(timestamp, num_workers) simulates num_workers workers processing tasks concurrently.
Each worker loops: get next ready task → mark PROCESSING → await asyncio.sleep(0.01) → mark COMPLETED.
Workers stop when no more ready tasks are available.
Use asyncio.Lock to protect queue access (two workers must not grab the same task).
Use a lock-protected shared list to track completion order.
Return the list of completed task_ids in the order they finished.
# Worker Pool pattern: N workers, each pulls work from the queue by itself
async def run_workers(self, timestamp, num_workers):   # N workers compete for ready tickets; each ticket goes to exactly one worker
    completed_order = []   # completion order
    async def worker():   # what a single worker does
        while True:   # keep going until nothing is left
            async with self._lock:   # take the lock (guards self.tasks)
                tid = self._get_next_ready_task_id()   # pick the next ready task
                if not tid:   # nothing ready
                    return   # done for the day
                self.tasks[tid]["status"] = "PROCESSING"
            await asyncio.sleep(0.01)   # outside the lock: do the work
            async with self._lock:   # take the lock again
                self.tasks[tid]["status"] = "COMPLETED"
                completed_order.append(tid)   # append under the same lock
    workers = []
    for _ in range(num_workers):   # _ means "this variable is unused"
        workers.append(worker())   # worker() returns a coroutine object; it has not started running
    await asyncio.gather(*workers)   # *workers unpacks the list; gather starts them all and waits for all to finish
    return completed_order   # returned after every worker has stopped
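L5/L6 Hashring = Gather pattern
    each op runs once and is done
    asyncio.gather(*[do(op) for op in list])
L5 TaskQueue = Worker Pool pattern
    each worker is a while True loop
    a worker finishes one task → takes the next → works again...
    returns only when nothing is left
Picture it:
Gather = 10 couriers, each delivers 1 parcel
Worker Pool = 3 workers who keep pulling from the work pile
asyncio.Lock() does not really lock self.tasks. It is only an "entry pass".
self._lock = asyncio.Lock()
async with self._lock:
    # being in here = holding the pass
    # any other coroutine that reaches async with self._lock has to wait
That is:
Cook A: async with self._lock: ← gets the pass
            read self.tasks
            modify self.tasks
        ← leaving the block = returning the pass
Cook B: async with self._lock: ← wants it, but A has not returned it
            (waiting...)
        ← A returned it, B gets it
            read self.tasks
            modify self.tasks
The lock does not lock the dict itself; it locks "the code inside the block".
We agree that any code touching self.tasks goes inside async with self._lock. That guarantees only one coroutine modifies self.tasks at any moment.
Code that modifies self.tasks without taking the lock simply ignores it and will still collide.
So the name self._lock only signals "this guards self (its tasks)". Python does not know that and does not enforce it. It is up to whoever writes the code to follow the rule.
Compare it with the lock on a kitchen door:
• a lock and key does not stop anyone from kicking the door in
• but everyone follows the convention: use the key to enter the kitchen
→ the result is one person inside at a time
• if someone kicks the door in → the rule is broken, collisions happen
A Python lock is the same: it only works when everyone plays by the rule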
workers = []                       # empty list to hold the worker objects
for _ in range(num_workers):       # loop num_workers times
    workers.append(worker())       # worker() returns a coroutine (not running yet)
# after 3 iterations workers holds 3 coroutines
# none of them is running yet (on standby):
# workers = [
#     <coroutine object worker>,   # cook 1 (standby)
#     <coroutine object worker>,   # cook 2 (standby)
#     <coroutine object worker>,   # cook 3 (standby)
# ]
await asyncio.gather(*workers)
# *workers unpacks the list into individual args
# same as: await asyncio.gather(worker(), worker(), worker())
# gather = start them all together, return only when all are done
# only at this point are 3 cooks really working at the same time
return completed_order             # every worker has stopped → return the list
await worker()   # start 1 cook and wait for them to finish
await worker()   # start the 2nd (only after the 1st is done)
await worker()   # start the 3rd...
# this is serial (one after another), not concurrent (at the same time)
# there is no worker pool effect
# you need gather to get real concurrency
tasks = []
for _ in range(num_workers):
    tasks.append(worker())
await asyncio.gather(*tasks)
# same effect as above, written out step by step so it is easier to read
async with self._lock:
    tid = ...                  ← take a task (fast)
    status = PROCESSING
await asyncio.sleep(0.01)      ← do the work (slow)
↑ this is NOT inside the lock!
    because working should not block the queue
    other workers must be able to keep taking tasks
If the sleep were inside the lock:
    only 1 worker works at a time = pointless
    no different from serial
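A runnable miniature of the worker pool shows that keeping the sleep outside the lock really lets workers overlap. This is a sketch under simplifying assumptions: a plain list stands in for the ready queue, and `active` / `peak` are extra counters added only to observe concurrency.

```python
import asyncio

async def run_workers(queue_ids, num_workers):
    pending = list(queue_ids)      # hypothetical stand-in for the ready queue
    lock = asyncio.Lock()
    completed = []
    active = 0                     # workers currently "cooking"
    peak = 0                       # highest value `active` ever reached

    async def worker():
        nonlocal active, peak
        while True:
            async with lock:                   # short critical section: take a task
                if not pending:
                    return
                tid = pending.pop(0)
                active += 1
                peak = max(peak, active)
            await asyncio.sleep(0.01)          # the slow part runs outside the lock
            async with lock:                   # short critical section: record the result
                active -= 1
                completed.append(tid)

    await asyncio.gather(*(worker() for _ in range(num_workers)))
    return completed, peak

completed, peak = asyncio.run(run_workers(["t1", "t2", "t3", "t4", "t5"], 2))
print(sorted(completed))   # ['t1', 't2', 't3', 't4', 't5']
print(peak)                # 2
```

Moving the sleep inside the first `async with lock` block would keep `peak` at 1, which is the serial behaviour warned about above.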
dispatch = send work to an external system · semaphore = caps how many run at once · fail-fast = reject immediately, never enter the sem
dispatch_external(timestamp, task_ids, max_concurrent) dispatches completed tasks to an external system.
For each task_id: check it exists and is "COMPLETED".
If not COMPLETED, return False immediately without acquiring the semaphore.
If COMPLETED, acquire the semaphore, await asyncio.sleep(0.01), mark as "DISPATCHED".
Use asyncio.Semaphore(max_concurrent) to limit concurrent dispatches.
Return a list of booleans (one per input task_id).
async def dispatch_external(self, timestamp, task_ids, max_concurrent):   # concurrent dispatch (semaphore + fail-fast)
    sem = asyncio.Semaphore(max_concurrent)   # N permits
    results = [None] * len(task_ids)   # pre-filled with None, written by index later
    async def dispatch_one(index, tid):   # one dispatch
        # ↓↓↓ fail-fast checks (never enter the sem) ↓↓↓
        if tid not in self.tasks:   # the ticket does not exist
            results[index] = False
            return   # leave without entering the sem
        if self.tasks[tid]["status"] != "COMPLETED":   # not COMPLETED
            results[index] = False
            return   # leave without entering the sem
        # ↑↑↑ end of fail-fast checks ↑↑↑
        async with sem:   # only valid tasks enter the sem
            await asyncio.sleep(0.01)   # simulated API call
            self.tasks[tid]["status"] = "DISPATCHED"
        results[index] = True
    tasks_to_run = []
    for i, tid in enumerate(task_ids):   # i = index, tid = task_id
        tasks_to_run.append(dispatch_one(i, tid))   # one coroutine per task
    await asyncio.gather(*tasks_to_run)   # start them all together
    return results
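Once a task is done (status=COMPLETED) it has to be sent to an external system (an external API or webhook, for example). The external system has a rate limit, so you cannot send everything at once: at most N dispatches may be in flight at the same time.
L5 uses asyncio.Lock():
    only 1 coroutine can be inside the block at a time
    async with self._lock:   # 1 at a time
        ...
L6 uses asyncio.Semaphore(N):
    N coroutines can be inside the block at a time
    sem = asyncio.Semaphore(3)
    async with sem:   # up to 3 at a time
        ...
# Picture it:
# Lock = 1 toilet (1 person at a time)
# Semaphore(3) = 3 toilets (at most 3 people at a time)
# more than N want in → queue up and wait
The spec says: "does not exist / not COMPLETED → return False without acquiring semaphore"
So when the check finds the task invalid, do not take the semaphore; return False right away.
async def dispatch_one(index, tid):
    # ↓ check first, leave immediately if invalid
    if tid not in self.tasks:
        results[index] = False
        return   # leave without entering the sem
    if self.tasks[tid]["status"] != "COMPLETED":
        results[index] = False
        return   # leave without entering the sem
    # ↓ only valid tasks enter the sem
    async with sem:
        await asyncio.sleep(0.01)   # simulated API call
        self.tasks[tid]["status"] = "DISPATCHED"
    results[index] = True
Invalid tasks should not occupy a sem slot
Suppose max_concurrent=2 and 5 tasks: [1, 2 (invalid), 3, 4 (invalid), 5]
❌ No fail-fast (everything goes through sem + sleep):
    Time 0: 1 + 2 enter the sem, sleep 0.01
    Time 0.01: 3 + 4 enter the sem, sleep 0.01
    Time 0.02: 5 enters the sem, sleep 0.01
    Total: 0.03 s
✅ Fail-fast (invalid tasks leave immediately):
    Time 0: 1 + 3 enter the sem, 2 + 4 fail immediately
    Time 0.01: 5 enters the sem
    Time 0.02: 5 ✅
    Total: 0.02 s (faster, because 2 and 4 never took a slot)
The exam's timing test checks this
An all-sleep implementation will exceed the time limit
1. Semaphore(N) allows N at a time (not 1)
2. Fail-fast: the check comes before the sem, invalid tasks return at once
3. results = [None] * len(task_ids)
    pre-filled list, written by index (preserves order)
4. enumerate(task_ids) gives index and value together
5. New status: DISPATCHED (sent to the external system)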
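The five-task example can be run as a stand-alone script. A sketch only: a plain `statuses` dict stands in for self.tasks, and `in_flight` / `peak` are extra counters added to confirm the semaphore cap.

```python
import asyncio

async def dispatch(statuses, task_ids, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)
    results = [None] * len(task_ids)       # pre-filled, written by index
    in_flight = 0
    peak = 0

    async def dispatch_one(index, tid):
        nonlocal in_flight, peak
        if statuses.get(tid) != "COMPLETED":   # missing ids give None, so they fail here too
            results[index] = False
            return                              # fail-fast: never touches the semaphore
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)           # simulated API call
            in_flight -= 1
            statuses[tid] = "DISPATCHED"
        results[index] = True

    await asyncio.gather(*(dispatch_one(i, t) for i, t in enumerate(task_ids)))
    return results, peak

statuses = {"t1": "COMPLETED", "t2": "FAILED", "t3": "COMPLETED", "t5": "COMPLETED"}
results, peak = asyncio.run(dispatch(statuses, ["t1", "t2", "t3", "t4", "t5"], 2))
print(results)   # [True, False, True, False, True]
print(peak)      # 2
```

"t4" is deliberately absent from `statuses`, so it exercises the "does not exist" branch while "t2" exercises the "not COMPLETED" branch.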
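A factory assembly line. Each item moves through several steps in order, and a step must finish before the next one starts. Write a class that simulates it.
Picture the assembly line (workflow):
┌──────────────────────────────────────────────────────────────┐
│ workflow "order123":                                         │
│   step1: 落單 (place order) → COMPLETED                      │
│   step2: 入袋 (bag it)      → COMPLETED                      │
│   step3: 出單 (print slip)  → PROCESSING ← being worked on   │
│   step4: 送貨 (deliver)     → PENDING ← waiting for step3    │
│   step5: 收錢 (collect)     → PENDING                        │
└──────────────────────────────────────────────────────────────┘
A workflow contains several steps:
workflow_id = the line's id ("order123")
step_id = which step on the line ("step3")
step_name = the step's name ("出單")
Rules:
1. steps always run in insertion order (first added, first run)
2. the previous step finishes → only then can the next one start
3. any step in the middle fails → everything rolls back
Compared with TaskQueue:
TaskQueue: every ticket is independent; which one goes next depends on priority + FIFO
    單1, 單2, 單3 are unrelated; whichever has the highest priority goes first
Workflow: steps depend on each other in order
    step1 → step2 → step3 → step4 → step5
    step2 does not start until step1 is done
    insertion order = execution order
What each level adds:
L1 adds CRUD (create_workflow, add_step, ...)
L2 adds get_progress (formatted string) + list_workflows (2 sort modes)
L3 adds the state machine: PENDING → READY → PROCESSING → COMPLETED|FAILED
    + auto trigger: completing a step moves the next one PENDING → READY
L4 adds fail_step + rollback: every COMPLETED step goes back to PENDING
L5 adds batch_operations: async + one lock per workflow
L6 adds execute_steps: takes an external_call function + semaphore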
import asyncio
from collections import defaultdict
class WorkflowEngine:
    def __init__(self):
        self.workflows = {}                       # L1: wf_id → [(step_id, step_name), ...]
        self.step_status = {}                     # L1: (wf_id, step_id) → status
        self.history = defaultdict(list)          # added in L4: wf_id → ["step: OLD->NEW"]
        self.locks = defaultdict(asyncio.Lock)    # added in L5: wf_id → asyncio.Lock
self.workflows = {
    "order123": [("step1", "落單"), ("step2", "入袋"), ("step3", "出單")],
    "order456": [("a", "Aaa"), ("b", "Bbb")],
}
# self.workflows[wf_id] is a list of tuples
# tuple = (step_id, step_name)
# list order = insertion order = execution order
self.step_status = {
    ("order123", "step1"): "COMPLETED",
    ("order123", "step2"): "COMPLETED",
    ("order123", "step3"): "PROCESSING",
    ("order456", "a"): "PENDING",
    ("order456", "b"): "PENDING",
}
# the key is a tuple (workflow_id, step_id)
# the value is the status string
You could store the status inside the tuple, e.g. [(step_id, step_name, status), ...]
but then every status change has to rebuild the tuple
With a separate dict (self.step_status) a status change is direct:
    self.step_status[(wf_id, step_id)] = "READY"
which is simpler
# Helper 1: _record, writes one state change into history (generic pattern for state machine + history)
def _record(self, workflow_id, step_id, old_status, new_status):   # called from L1 onwards, only exposed in L4
    entry = step_id + ": " + old_status + "->" + new_status   # builds "step1: PENDING->READY"
    self.history[workflow_id].append(entry)   # defaultdict(list) creates the list on first use
# Helper 2: _set_status, the single entry point for status changes (guarantees every change is recorded)
def _set_status(self, workflow_id, step_id, new_status):   # every status change goes through here
    key = (workflow_id, step_id)   # tuple key of the step_status dict
    old_status = self.step_status[key]   # read the old status first (needed for the record)
    self.step_status[key] = new_status
    self._record(workflow_id, step_id, old_status, new_status)   # write it into history
# Helper 3: _process_triggers, auto-triggers the next ready step (explicitly required by the spec)
def _process_triggers(self, workflow_id):   # public methods call this before they return
    if workflow_id not in self.workflows:   # unknown workflow → return quietly
        return
    steps = self.workflows[workflow_id]   # the step list (insertion order preserved)
    found_completed = False   # flag: has the scan seen a COMPLETED step so far?
    for step_id, step_name in steps:   # walk the steps in insertion order
        key = (workflow_id, step_id)
        status = self.step_status[key]
        if status == "COMPLETED":
            found_completed = True   # remember that an earlier step is done
        elif status == "PENDING" and found_completed:   # first PENDING step after a COMPLETED one → should be ready
            self._set_status(workflow_id, step_id, "READY")   # promote to READY (_set_status also records history)
            break   # promote one step per call, never a chain
_record(wf, step, old, new)
    writes the state change into history as "step1: PENDING->READY"
_set_status(wf, step, new)
    every status change goes through it (guarantees history is recorded)
_process_triggers(wf)
    called near the end of public methods
    promotes "the next PENDING step after a COMPLETED one" to READY
workflow = assembly line · step = one stage on the line · PENDING = waiting in line · CRUD = create / read / update / delete
def create_workflow(self, workflow_id):   # the factory opens a new line
    if workflow_id in self.workflows:   # a line with this name already exists
        return "exists"   # duplicate → answer with the string "exists" (not a bool)
    self.workflows[workflow_id] = []   # empty list, steps are added later
    return "created"
def add_step(self, workflow_id, step_id, step_name):   # append a step to the end of a line
    if workflow_id not in self.workflows:   # the line was never opened
        return "workflow not found"
    # scan the list for an existing step with the same id
    for existing_id, existing_name in self.workflows[workflow_id]:
        if existing_id == step_id:   # id clash → reject
            return "step exists"   # the step is the duplicate, not the workflow
    self.workflows[workflow_id].append((step_id, step_name))   # append (the auto-trigger relies on this order)
    self.step_status[(workflow_id, step_id)] = "PENDING"   # initial status
    self._record(workflow_id, step_id, "NONE", "PENDING")   # history: from "not existing" to PENDING
    self._process_triggers(workflow_id)   # run the trigger check after every change (only matters from L3)
    return "added"
def get_step_status(self, workflow_id, step_id):   # current status of a step
    if workflow_id not in self.workflows:
        return "workflow not found"
    key = (workflow_id, step_id)   # the status dict is keyed by (workflow_id, step_id)
    if key not in self.step_status:   # workflow exists but the step was never added
        return "step not found"
    result = self.step_status[key]   # "PENDING" / "READY" / ...
    self._process_triggers(workflow_id)   # even a query runs the trigger (lazy model); result was read before it
    return result
def delete_workflow(self, workflow_id):   # tear down a whole line; all 3 dicts must be cleaned
    if workflow_id not in self.workflows:   # never opened → nothing to delete
        return "not found"
    steps = self.workflows[workflow_id]   # every step of the line (list of tuples)
    for step_id, step_name in steps:   # clean step_status step by step
        key = (workflow_id, step_id)
        if key in self.step_status:
            del self.step_status[key]   # remove the (workflow_id, step_id) → status entry
    del self.workflows[workflow_id]   # then remove the workflows entry
    if workflow_id in self.history:   # history is a defaultdict, the key may or may not exist
        del self.history[workflow_id]   # remove the history too
    return "deleted"
def __init__(self):
    self.workflows = {}
    self.step_status = {}
    self.history = defaultdict(list)
self.workflows = {  # workflow catalogue (wf_id → list of step tuples)
    "order123": [
        ("step1", "place order"),  # (step_id, step_name), in insertion order
        ("step2", "pack"),
    ],
}
self.step_status = {  # status of each step ((wf_id, step_id) → status)
    ("order123", "step1"): "PENDING",  # waiting in the queue
    ("order123", "step2"): "PENDING",
}
self.history = defaultdict(list)  # state-change log (written from L1 on; get_history arrives in L4)
self.locks = defaultdict(asyncio.Lock)  # per-workflow lock (added in L5)
_record(workflow_id, step_id, old, new)
    self.history[workflow_id].append(f"{step_id}: {old}->{new}")
_process_triggers(workflow_id)
    called in L1 but a no-op there (it only has an effect from L3)
progress = how far along the workflow is; sort_by = which field to sort on; tie-break = how to order two items that compare equal
def get_progress(self, workflow_id):  # render the whole workflow's progress as one line
    if workflow_id not in self.workflows:  # does the workflow exist?
        return "workflow not found"
    self._process_triggers(workflow_id)  # run the trigger first (lazy model: every query does)
    parts = []  # collects one string per step
    for step_id, step_name in self.workflows[workflow_id]:  # the list already preserves insertion order
        status = self.step_status[(workflow_id, step_id)]  # look up by tuple key
        parts.append(step_id + "(" + status + ")")  # e.g. "step1(COMPLETED)"
    return ", ".join(parts)  # e.g. "step1(COMPLETED), step2(PENDING)"
def list_workflows(self, sort_by="id"):  # list every workflow, two possible orderings
    if len(self.workflows) == 0:  # no workflows at all
        return ""  # empty string (as the spec requires)
    items = []  # collects (wf_id, count) tuples
    for wf_id in self.workflows:
        count = len(self.workflows[wf_id])  # number of steps = length of the list
        items.append((wf_id, count))  # tuples make the sort below easy
    if sort_by == "id":  # ordering 1: plain alphabetical
        items.sort(key=lambda x: x[0])  # ascending by wf_id
    elif sort_by == "steps":  # ordering 2: most steps first
        items.sort(key=lambda x: (-x[1], x[0]))  # negated count puts larger first; ties fall back to ascending id
    parts = []  # same string-building routine as get_progress
    for wf_id, count in items:
        parts.append(wf_id + "(" + str(count) + " steps)")  # e.g. "wf1(3 steps)"
    return ", ".join(parts)
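The `(-count, id)` key used for `sort_by="steps"` is worth seeing on its own. A small sketch with made-up data (the dict below maps workflow ids straight to step counts, which is a simplification of `self.workflows`):

```python
# wf_id -> number of steps (made-up data for the demo)
workflows = {"wf_b": 3, "wf_a": 3, "wf_c": 5, "wf_d": 1}

# Ordering 1: plain ascending id.
by_id = sorted(workflows.items(), key=lambda item: item[0])

# Ordering 2: step count descending, ties broken by ascending id.
# Negating the count flips only that component of the tuple key.
by_steps = sorted(workflows.items(), key=lambda item: (-item[1], item[0]))


def render(items):
    return ", ".join(f"{wf_id}({count} steps)" for wf_id, count in items)


print(render(by_id))
print(render(by_steps))
```

Using a single tuple key keeps the tie-break rule in one place; `reverse=True` would also reverse the id order, which is usually not what the spec asks for.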
def __init__(self):
    self.workflows = {}
    self.step_status = {}
    self.history = defaultdict(list)
self.workflows = {  # workflow catalogue (wf_id → list of step tuples)
    "order123": [("step1", "place order"), ("step2", "pack")],
}
self.step_status = {  # status of each step
    ("order123", "step1"): "PENDING",
    ("order123", "step2"): "PENDING",
}
self.history = defaultdict(list)  # state-change log (written from L1 on; get_history arrives in L4)
self.locks = defaultdict(asyncio.Lock)  # per-workflow lock (added in L5)
_process_triggers(workflow_id)
    called once at the start of get_progress (lazy trigger)
state machine = a fixed set of statuses with allowed transitions; auto-trigger = automatically promote the next step; PENDING → READY → PROCESSING → COMPLETED|FAILED
def ready_step(self, workflow_id, step_id):  # move a PENDING step to READY
    if workflow_id not in self.workflows:  # workflow was never created → reject
        return "workflow not found"
    key = (workflow_id, step_id)  # tuple key into the status dict
    if key not in self.step_status:  # step was never registered
        return "step not found"
    if self.step_status[key] != "PENDING":  # this transition may only start from PENDING
        return "not pending"
    self._set_status(workflow_id, step_id, "READY")  # helper updates step_status and records history
    self._process_triggers(workflow_id)  # run the trigger after every status change (lazy)
    return "readied"  # success string required by the spec
def start_step(self, workflow_id, step_id):  # move READY to PROCESSING (the cook starts working)
    if workflow_id not in self.workflows:
        return "workflow not found"
    key = (workflow_id, step_id)
    if key not in self.step_status:
        return "step not found"
    if self.step_status[key] != "READY":  # must start from READY (a PENDING step is not ready yet)
        return "not ready"
    self._set_status(workflow_id, step_id, "PROCESSING")  # helper updates step_status and records history
    self._process_triggers(workflow_id)  # run the trigger; it can still promote a PENDING step that follows an earlier COMPLETED one
    return "started"
def complete_step(self, workflow_id, step_id):  # move PROCESSING to COMPLETED (key point: auto-triggers the next step)
    if workflow_id not in self.workflows:
        return "workflow not found"
    key = (workflow_id, step_id)
    if key not in self.step_status:
        return "step not found"
    if self.step_status[key] != "PROCESSING":  # not started yet, so there is nothing to complete
        return "not processing"
    self._set_status(workflow_id, step_id, "COMPLETED")  # helper updates step_status and records history
    self._process_triggers(workflow_id)  # ← key point: after a complete, the trigger promotes the next PENDING step to READY
    return "completed"
def __init__(self):
    self.workflows = {}
    self.step_status = {}
    self.history = defaultdict(list)
self.workflows = {  # workflow catalogue
    "order123": [("step1", "place order"), ("step2", "pack"), ("step3", "invoice"), ("step4", "deliver")],
}
self.step_status = {  # status of each step
    ("order123", "step1"): "COMPLETED",   # done
    ("order123", "step2"): "PROCESSING",  # in progress
    ("order123", "step3"): "READY",       # ready (promoted by auto-trigger)
    ("order123", "step4"): "PENDING",     # waiting in the queue
}
self.history = defaultdict(list)  # state-change log (written from L1 on; get_history arrives in L4)
self.locks = defaultdict(asyncio.Lock)  # per-workflow lock (added in L5)
_set_status(workflow_id, step_id, new_status)
    records history and updates step_status
_process_triggers(workflow_id)
    once a COMPLETED step is seen, the next PENDING step is moved to READY automatically
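The three L3 methods differ only in the required status, the target status, and two strings. One way to see that, as a sketch rather than the notes' own layout, is a transition table. `TRANSITIONS` and `apply_transition` are hypothetical names, and the workflow-existence check, history, and triggers are left out to keep it short.

```python
# action -> (required status, target status, rejection string, success string)
TRANSITIONS = {
    "ready_step": ("PENDING", "READY", "not pending", "readied"),
    "start_step": ("READY", "PROCESSING", "not ready", "started"),
    "complete_step": ("PROCESSING", "COMPLETED", "not processing", "completed"),
}


def apply_transition(step_status, key, action):
    """Apply one guarded transition to step_status (mutated in place)."""
    required, target, reject_msg, ok_msg = TRANSITIONS[action]
    if key not in step_status:
        return "step not found"
    if step_status[key] != required:
        return reject_msg
    step_status[key] = target
    return ok_msg


key = ("order123", "step1")
status = {key: "PENDING"}
results = [
    apply_transition(status, key, "start_step"),     # rejected: still PENDING
    apply_transition(status, key, "ready_step"),
    apply_transition(status, key, "start_step"),
    apply_transition(status, key, "complete_step"),
    apply_transition(status, key, "complete_step"),  # rejected: already COMPLETED
]
print(results)
```

In a timed mock, writing the three methods out by hand as above is usually faster; the table is mainly a way to check that every guard and return string is accounted for.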
rollback = undo earlier progress; fail = the step fails; history = log of status changes; a single failure rolls everything back
def fail_step(self, workflow_id, step_id):  # step fails → mark FAILED and roll back every completed step
    if workflow_id not in self.workflows:
        return "workflow not found"
    key = (workflow_id, step_id)  # tuple key into the status dict
    if key not in self.step_status:
        return "step not found"
    if self.step_status[key] != "PROCESSING":  # only a step in progress can fail: PENDING/READY have not started, COMPLETED is already done
        return "not processing"
    # ── first: mark the offending step FAILED ──
    self._set_status(workflow_id, step_id, "FAILED")  # e.g. step3: PROCESSING → FAILED (history recorded)
    # ── second: rollback, walk the whole workflow and move every COMPLETED step back to PENDING ──
    for s_id, s_name in self.workflows[workflow_id]:
        s_key = (workflow_id, s_id)
        if self.step_status[s_key] == "COMPLETED":  # a step that had finished
            self._set_status(workflow_id, s_id, "PENDING")  # back to PENDING (earlier results are reset)
    self._process_triggers(workflow_id)  # routine trigger call (no COMPLETED step is left, so nothing is promoted)
    return "failed and rolled back"  # the return string names both actions
def get_history(self, workflow_id):  # return every recorded status change of the workflow
    if workflow_id not in self.history:  # nothing was ever recorded
        return []  # empty list (not an empty string!)
    result = []  # build a new list; do not return the internal one
    for entry in self.history[workflow_id]:  # copy entry by entry
        result.append(entry)  # so the caller cannot mutate internal state
    return result  # list of strings, e.g. ["step1: PENDING->READY", ...]
def __init__(self):
    self.workflows = {}
    self.step_status = {}
    self.history = defaultdict(list)
self.workflows = {  # workflow catalogue
    "order123": [("step1", "place order"), ("step2", "pack"), ("step3", "invoice"), ("step4", "deliver")],
}
self.step_status = {  # status of each step
    ("order123", "step1"): "PENDING",  # rolled back
    ("order123", "step2"): "PENDING",  # rolled back
    ("order123", "step3"): "FAILED",   # the one that failed
    ("order123", "step4"): "PENDING",  # was PENDING all along
}
self.history = {  # state-change log (abridged: registration and earlier transitions of step2-step4 are omitted)
    "order123": [
        "step1: NONE->PENDING",
        "step1: PENDING->READY",
        "step1: READY->PROCESSING",
        "step1: PROCESSING->COMPLETED",
        "step3: PROCESSING->FAILED",
        "step1: COMPLETED->PENDING",
        "step2: COMPLETED->PENDING",
    ],
}
self.locks = defaultdict(asyncio.Lock)  # per-workflow lock (added in L5)
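The assembly line breaks down halfway through. fail_step does two things:
Starting state:
    step1: COMPLETED   ← done
    step2: COMPLETED   ← done
    step3: PROCESSING  ← in progress (this is the one that fails)
    step4: PENDING     ← not its turn yet
call fail_step("wf", "step3")
    First: mark step3 as FAILED
    Second: rollback, every COMPLETED step goes back to PENDING
Result:
    step1: PENDING  ← rolled back from COMPLETED
    step2: PENDING  ← rolled back from COMPLETED
    step3: FAILED   ← the failed step (it does not roll itself back)
    step4: PENDING  ← was already PENDING
Why roll back? A failure at step3 means the earlier results may not be trustworthy, so the whole line starts over.
Why does step3 itself stay put? It remains FAILED as a record of what went wrong.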
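The rollback rule can be sketched as a standalone function. This is a simplification under stated assumptions: statuses are keyed by `step_id` only (the notes use `(workflow_id, step_id)` tuples), history is not recorded, and `fail_and_rollback` is a hypothetical name.

```python
def fail_and_rollback(order, status, failed_step):
    """order: step ids in registration order; status: step_id -> status (mutated in place)."""
    if status.get(failed_step) != "PROCESSING":
        return "not processing"
    status[failed_step] = "FAILED"        # 1) mark the failing step
    for step_id in order:                 # 2) undo every finished step
        if status[step_id] == "COMPLETED":
            status[step_id] = "PENDING"
    return "failed and rolled back"


order = ["step1", "step2", "step3", "step4"]
status = {
    "step1": "COMPLETED",
    "step2": "COMPLETED",
    "step3": "PROCESSING",
    "step4": "PENDING",
}
outcome = fail_and_rollback(order, status, "step3")
print(outcome, status)
```

Because the failed step is set to FAILED before the loop runs, the loop never touches it: it only matches COMPLETED entries.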
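History is written on every status change from L1 onward (via _set_status → _record).
L1-L3 have no method for reading it back; get_history is added in L4.
It returns list[str] (not a string), or [] when nothing was recorded.
get_history("wf") → [    (abridged: only the entries relevant to the rollback are shown)
    "step1: NONE->PENDING",
    "step1: PENDING->READY",
    "step1: READY->PROCESSING",
    "step1: PROCESSING->COMPLETED",
    "step3: PROCESSING->FAILED",
    "step1: COMPLETED->PENDING",
    "step2: COMPLETED->PENDING",
]
_set_status(wf, step, new)
    fail_step calls it once for FAILED, then once per rolled-back step for PENDING
    every call also records history
_process_triggers(wf)
    still called at the end of fail_step (although there is nothing left to promote)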
async = non-blocking; batch = several operations in one call; lock = mutual exclusion (stops operations fighting over the same state); defaultdict(Lock) = one lock per workflow
async batch_operations(ops) takes a list of dicts {"action", "workflow_id", "step_id"}. // batch version of start_step / complete_step
Run all ops concurrently with asyncio.gather. // everything starts together
Use defaultdict(asyncio.Lock) keyed by workflow_id: ops on the same workflow are serialized, different workflows can run concurrently. // one lock per workflow
async def batch_operations(self, ops):  # run many ops at once: same workflow queues up, different workflows run side by side
    async def run_one(op):  # inner helper: handle a single op
        action = op["action"]  # "start_step" / "complete_step"
        wf_id = op["workflow_id"]
        s_id = op["step_id"]
        lock = self.locks[wf_id]  # this workflow's lock (defaultdict creates it on first use)
        async with lock:  # ops on the same workflow run one at a time
            if action == "start_step":  # dispatch to the sync method
                return self.start_step(wf_id, s_id)  # the sync method returns a string
            elif action == "complete_step":
                return self.complete_step(wf_id, s_id)
            else:  # unknown action
                return "unknown action"  # defensive: return a string instead of raising
    tasks = []  # collects the coroutines
    for op in ops:
        tasks.append(run_one(op))  # calling run_one(op) returns a coroutine immediately (not yet running)
    results = await asyncio.gather(*tasks)  # gather schedules them all and returns once every one has finished
    return list(results)  # gather already returns a list; this is just an explicit copy
def __init__(self):
    self.workflows = {}
    self.step_status = {}
    self.history = defaultdict(list)
    self.locks = defaultdict(asyncio.Lock)  # ← added in L5
self.locks = defaultdict(asyncio.Lock)
What it looks like:
self.locks = {
    "A": <Lock>,  ← the lock dedicated to workflow A
    "B": <Lock>,  ← the lock dedicated to workflow B
}
The defaultdict trick:
    self.locks["A"]  ← first lookup of "A" → creates a Lock(), stores it, returns it
    self.locks["A"]  ← second lookup of "A" → already there → returns the same lock
    self.locks["C"]  ← first lookup of "C" → creates a new one
In other words:
    self._lock = asyncio.Lock()             ← a single lock
    self.locks = defaultdict(asyncio.Lock)  ← a dict of locks, one per key
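TaskQueue:
    self.tasks is one dict; every task lives in the same place
    worker A modifies the dict to claim a task, and worker B modifies the same dict
    they are bound to collide → one lock protects the whole dict
Workflow:
    changing workflow A's step_status never collides with changing B's
    A's key is ("A","step1"), B's key is ("B","step1")
    different entries → no need to wait in the same queue
In terms of passes:
    TaskQueue = the whole restaurant has 1 pass; 10 cooks wait in one queue
    Workflow = each assembly line has its own pass; A queues for A's, B queues for B's
    A and B never wait for each other
Lock = there is exactly 1 pass. To go in and work you must hold it; you hand it back when done, and only then can the next one take it.
But Workflow L5 does not have one pass for everybody. Each workflow has its own:
    workflow "A" has its own pass
    workflow "B" has its own pass
ops = [
    start_step(A, s1)     ← needs A's pass
    complete_step(A, s2)  ← also needs A's pass
    start_step(B, s1)     ← needs B's pass
]
gather starts all 3 ops together:
    A's s1 gets A's pass first → runs start_step → returns the pass
    A's s2 waits for A's pass → gets its turn once s1 returns it → runs complete_step
    B's s1 takes B's pass → unrelated to A, runs immediately
Result: A's 2 ops are queued; B runs in parallel with A
defaultdict(asyncio.Lock) is the machine that issues passes automatically: the first time anyone asks for self.locks["A"], it prints a pass dedicated to A. Later lookups return the same one. Every workflow gets its own pass without you managing anything.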
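The "pass machine" behaviour of `defaultdict(asyncio.Lock)` is easy to confirm in isolation. A short sketch; the `demo` coroutine is only a wrapper so that the locks are created while an event loop is running, which is the safest choice across Python versions.

```python
import asyncio
from collections import defaultdict


async def demo():
    locks = defaultdict(asyncio.Lock)  # factory is called on first lookup of a key
    first = locks["A"]    # creates and stores a lock for "A"
    second = locks["A"]   # returns the stored lock, no new object
    other = locks["B"]    # a separate lock for "B"
    return first is second, first is other, sorted(locks)


same_key_same_lock, different_key_same_lock, keys = asyncio.run(demo())
print(same_key_same_lock, different_key_same_lock, keys)
```

One side effect to keep in mind: merely reading `locks[key]` inserts the key, so the dict grows with every distinct workflow id that is ever looked up.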
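L5 Lock = 1 pass per workflow → ops on the same workflow queue up
L6 Semaphore(N) = N passes for the whole venue → at most N run at once
There is really only one difference: where the lock hangs.
TaskQueue L5: one global self._lock
    → every op queues up (serial)
    → because all tasks live in the same self.tasks dict
Workflow L5: one self.locks[wf_id] per workflow
    → same workflow queues up, different workflows run in parallel
    → because workflow A's step_status never collides with B's
The code is nearly identical:
TaskQueue L5, global lock:
    async with self._lock:
        return self.complete_task(ts, tid)
Workflow L5, per-workflow lock:
    lock = self.locks[wf_id]
    async with lock:
        return self.complete_step(wf_id, s_id)
The only difference is self._lock (one pass shared by the whole restaurant) versus self.locks[wf_id] (one pass per assembly line). The rest of the pattern is identical: inner run_one function → for loop building a list of coroutines → gather runs them together → return list(results).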
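tasks = []                                  start with an empty list
for op in ops:                              for each op
    tasks.append(run_one(op))               calling run_one returns a coroutine
    coroutine = "a promise to do this work" that has not started yet
    what to do has been written down, but nothing has run
At this point tasks = [coroutine_1, coroutine_2, coroutine_3]
    none of the 3 is running; they are just sitting in the list
results = await asyncio.gather(*tasks)
    *tasks = unpack the list into separate arguments
    i.e. gather(coroutine_1, coroutine_2, coroutine_3)
What gather does:
    schedules the 3 coroutines to run concurrently
    waits until all 3 have finished
    collects the 3 return values into one list
results = ["started", "completed", "started"]
    in the same order as the input
return list(results)                        guarantees a plain list
Why not a plain for loop that awaits each op in turn? That would be serial: op 2 cannot start until op 1 finishes. gather runs them concurrently, which is much faster once ops actually wait on something. The locks take care of which ops must queue and which may run side by side.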
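A small runnable sketch of the two guarantees described above: `asyncio.gather` returns results in input order, and a per-key lock serializes ops that share a key while leaving other keys free. The `asyncio.sleep(0.01)` stands in for work done while holding the lock and is an assumption of the demo, as are the tuple-shaped ops.

```python
import asyncio
from collections import defaultdict


async def demo():
    locks = defaultdict(asyncio.Lock)
    log = []  # (event, workflow, step) in the order things actually happened

    async def run_one(op):
        wf_id, label = op
        async with locks[wf_id]:
            log.append(("enter", wf_id, label))
            await asyncio.sleep(0.01)  # stand-in for work done under the lock
            log.append(("exit", wf_id, label))
            return f"{wf_id}:{label}"

    ops = [("A", "s1"), ("A", "s2"), ("B", "s1")]
    results = await asyncio.gather(*(run_one(op) for op in ops))
    return list(results), log


results, log = asyncio.run(demo())
print(results)
print(log)
```

In the log, workflow B enters while A's first op is still inside its lock, and A's second op enters only after the first has exited.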
tasks = []
for op in ops:
    tasks.append(run_one(op))
results = await asyncio.gather(*tasks)
return list(results)
Every mock reuses these 5 lines as they are; only two things change:
1. which method run_one dispatches to
2. where the lock hangs (global self._lock or per-key self.locks[key])
No new helper
    it calls the existing L3/L4 start_step / complete_step directly
    the lock only wraps them
external_call = an external API; semaphore = cap on how many run at once; READY check = skip immediately if the step does not qualify
async execute_steps(workflow_id, step_ids, external_call, max_concurrent) runs each step's full lifecycle. // takes an external function and drives it too
For each step: if READY, call start_step → await external_call(wf_id, step_id) → complete_step. // only READY steps are run
If NOT READY: skip, return "skipped:step_id". // does not qualify, leave at once
Use asyncio.Semaphore(max_concurrent) to limit concurrent executions. // the semaphore caps it at N
Use the per-workflow lock around start_step and complete_step. // state changes need the lock
Return list: "executed:id" / "skipped:id" / "error:id:msg". // 3 kinds of result
async def execute_steps(self, workflow_id, step_ids, external_call, max_concurrent):  # take steps from READY to COMPLETED, guarding every state transition
    semaphore = asyncio.Semaphore(max_concurrent)  # N passes (at most N steps in flight)
    async def run_one(step_id):  # inner helper: one step's full lifecycle
        key = (workflow_id, step_id)  # tuple key into the status dict
        # fail-fast check, done before taking a semaphore slot: anything not READY is skipped
        if key not in self.step_status:  # step was never registered
            return "skipped:" + step_id
        if self.step_status[key] != "READY":  # not READY → do not execute
            return "skipped:" + step_id
        async with semaphore:  # take a pass (wait if none is free)
            try:  # external_call may raise
                # 1) start_step (hold the lock briefly to change state)
                lock = self.locks[workflow_id]  # per-key lock prevents state collisions
                async with lock:
                    start_result = self.start_step(workflow_id, step_id)  # READY → PROCESSING
                if start_result != "started":  # start did not succeed (lost a race, etc.)
                    return "skipped:" + step_id
                # 2) call the external service outside the lock, because it may be slow
                await external_call(workflow_id, step_id)  # lock is released, other steps are not held up
                # 3) complete_step (again under the lock, briefly)
                async with lock:
                    complete_result = self.complete_step(workflow_id, step_id)  # PROCESSING → COMPLETED
                return "executed:" + step_id  # all three stages succeeded
            except Exception as e:  # external_call raised; the step is left in PROCESSING
                return "error:" + step_id + ":" + str(e)  # one failing step does not affect the others
    tasks = []  # collects the coroutines
    for step_id in step_ids:
        tasks.append(run_one(step_id))  # one coroutine per step_id
    results = await asyncio.gather(*tasks)  # start them all (throttled by the semaphore)
    return list(results)  # same order as step_ids
def __init__(self):
    self.workflows = {}
    self.step_status = {}
    self.history = defaultdict(list)
    self.locks = defaultdict(asyncio.Lock)
self.workflows = {  # workflow catalogue
    "order123": [("step1", "place order"), ("step2", "pack")],
}
self.step_status = {  # status of each step
    ("order123", "step1"): "COMPLETED",
    ("order123", "step2"): "READY",
}
self.history = defaultdict(list)  # state-change log
self.locks = defaultdict(asyncio.Lock)  # per-workflow lock
The semaphore is created inside the method (one per call).
L6 has 3 patterns; Workflow uses the third:
1. Fail-fast + Sleep (the simplest)
    leave at once if the check fails; otherwise sem + sleep + return True/False
    used by: Hashring, DNS, Session, Notification, ChatRoute
2. Worker Pool
    grab a task → do it inside the sem → mark done
    used by: TaskQueue
3. Lifecycle (the one Workflow uses)
    check → sem → lock(start) → external_call → lock(complete)
    has try/except and 3 kinds of return string
    used by: Workflow, PackageMgr
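async with semaphore:                 enter the sem (throttling)
    async with lock:                  enter the lock
        start_step(wf, step)          status: READY → PROCESSING
    leave the lock (fast: only one value changes)
    await external_call(wf, step)     no lock held here!
        the external API may take 2 seconds
        holding the lock would make every other step wait 2 seconds
        so it runs outside the lock
    async with lock:                  enter the lock again
        complete_step(wf, step)       status: PROCESSING → COMPLETED
    leave the lock
If the lock were held throughout: external_call takes 2 seconds, and every other step of the same workflow waits 2 seconds.
With the call outside the lock: only the status change itself is locked (a few ms), so a slow call does not hold up other steps.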
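A runnable sketch of the lifecycle shape above, reduced to a plain status dict so the effect of releasing the lock is measurable. Everything here is a stand-in: `external_call` is a fake slow API that counts how many calls overlap, the single `lock` plays the role of `self.locks[workflow_id]`, and the step ids are invented.

```python
import asyncio


async def demo(max_concurrent=2):
    semaphore = asyncio.Semaphore(max_concurrent)
    lock = asyncio.Lock()  # stands in for self.locks[workflow_id]
    status = {f"s{i}": "READY" for i in range(4)}
    in_flight = 0
    peak = 0

    async def external_call(step_id):  # hypothetical slow API
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1

    async def run_one(step_id):
        if status.get(step_id) != "READY":  # fail-fast, before the semaphore
            return "skipped:" + step_id
        async with semaphore:
            async with lock:                # short critical section
                status[step_id] = "PROCESSING"
            await external_call(step_id)    # lock is NOT held here
            async with lock:                # short critical section
                status[step_id] = "COMPLETED"
        return "executed:" + step_id

    step_ids = ["s0", "s1", "s2", "s3", "missing"]
    results = await asyncio.gather(*(run_one(s) for s in step_ids))
    return list(results), peak, status


results, peak, status = asyncio.run(demo())
print(results, peak)
```

The peak of overlapping external calls reaches the semaphore limit, which could not happen if the lock were held across `external_call`: all four steps share one lock, so they would run strictly one at a time.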
TaskQueue L6: 1 stage
    check COMPLETED → sem → sleep → mark DISPATCHED → True/False
Workflow L6: 3 stages, entering and leaving the lock
    check READY → sem → lock(start) → release → external_call → lock(complete)
What is extra:
    1. try/except: external_call may raise
    2. 3 kinds of return: "executed:id" / "skipped:id" / "error:id:msg"
    3. start_result check: after leaving the lock, verify the start succeeded
No new helper
    it calls the existing L3 start_step / complete_step directly
    uses self.locks[workflow_id], same as L5
── Helpers ──
🟰 _process_deprecations: same lazy TTL pattern as Bank _process_cashbacks
⚠️ _has_circular_dep: not in TaskQueue (BFS circular check unique to PkgMgr)
── L1 CRUD ──
🟰 register: same as TaskQueue add_task (check exists → add → string)
⚠️ install: not in TaskQueue (unique to PkgMgr)
⚠️ uninstall: not in TaskQueue (unique to PkgMgr)
🟰 get_status: same as TaskQueue get_task_status
── L2 Sort ──
🟰 list_installed: same pattern as Workflow list_workflows
🟰 search_by_prefix: same as InMemDB scan_by_prefix
── L3 Deprecation TTL ──
🟰 register (L3): same TTL pattern as FS add_file_with_ttl
⚠️ install (L3): adds a deprecated check (logic specific to TTL)
── L4 Dependencies ──
⚠️ add_dependency: similar to TaskQueue deps, plus a circular check (BFS)
⚠️ install (L4): similar to TaskQueue _deps_met, but checks "installed"
⚠️ uninstall (L4): unique, checks whether anything depends on the package
🟰 get_dependency_history: same as Workflow get_history
── L5 Batch ──
🟰 batch_operations: same as Workflow L5 (lock per pkg_id)
── L6 Download ──
⚠️ download_packages: Lifecycle pattern (similar to Workflow L6)
A system for installing apps. Some apps need other apps installed first (dependencies). The task is to write a class that simulates it.
Picture an app store holding a set of packages:
┌──────────────────────────────────────────────┐
│ pkg_a (v1.0) ← no dependency                 │
│ pkg_b (v2.0) ← needs pkg_a first             │
│ pkg_c (v1.5) ← needs pkg_a and pkg_b first   │
│ pkg_d (v3.0) ← deprecated (expired)          │
└──────────────────────────────────────────────┘
Each package has:
    pkg_id = identifier ("pkg_a")
    version = version string ("1.0.0")
Three state axes:
    1. registered: is it listed in the app store?
    2. installed: is it on the user's machine?
    3. deprecated: once the TTL is reached it becomes deprecated
Example: the user wants to install pkg_c
    pkg_c's dependencies = [pkg_a, pkg_b]
    install(pkg_c):
        is pkg_a installed? no → "missing dependency:pkg_a"
    required order: install(pkg_a) → install(pkg_b) → install(pkg_c)
    uninstall(pkg_a):
        does anything depend on pkg_a? yes, pkg_b and pkg_c
        → "dependency conflict:pkg_b"
    pkg_b and pkg_c must be uninstalled first
Later levels add more:
    L2 adds list_installed and search_by_prefix
    L3 adds deprecation TTL (expired → DEPRECATED)
    L4 adds the dependency graph + circular check
    L5 adds async batch + lock per pkg
    L6 adds download + external_call
import asyncio
import time
from collections import defaultdict
class PackageManager:
    def __init__(self, clock=None):
        self.packages = {}  # L1: pkg_id → version
        self.installed = {}  # L1: pkg_id → True/False
        self.deprecation_times = {}  # added in L3: pkg_id → expiry time
        self.deprecated = {}  # added in L3: pkg_id → True
        self.dependencies = defaultdict(list)  # added in L4: pkg_id → [dep_ids]
        self.dep_history = defaultdict(list)  # added in L4: pkg_id → [event strings]
        self.locks = defaultdict(asyncio.Lock)  # added in L5: one lock per pkg
        if clock is not None:  # added in L3: a fake clock can be injected for tests
            self.clock = clock  # use the injected clock
        else:  # no fake clock: fall back to real time in milliseconds (the normal runtime path)
            self.clock = lambda: time.time() * 1000  # all TTL checks then follow the wall clock
self.packages = {
    "pkg_a": "1.0.0",
    "pkg_b": "2.0.0",
    "pkg_c": "1.5.0",
}
self.installed = {
    "pkg_a": True,   # installed
    "pkg_b": False,  # registered but not installed
    "pkg_c": True,   # installed
}
The two dicts are kept separately. packages = "is it in the app store", installed = "is it on the user's machine".
L1: packages, installed (two parallel dicts)
L2: (no new field, only query methods)
L3: deprecation_times, deprecated, clock
L4: dependencies, dep_history
L5: locks
L6: (no new field, the semaphore is created inside the method)
# L3: one pass over the packages, marking those whose TTL has been reached as DEPRECATED
def _process_deprecations(self):  # flag expired packages (lazy)
    now = self.clock()  # current time
    for pkg_id in list(self.packages.keys()):
        dep_time = self.deprecation_times.get(pkg_id, 0)  # expiry time (default 0 = never expires)
        if dep_time > 0 and now >= dep_time:  # an expiry is set and it has been reached
            if pkg_id not in self.deprecated:  # mark only once
                self.deprecated[pkg_id] = True
# L4: would adding this dependency create a cycle?
# BFS from depends_on and see whether pkg_id can be reached
def _has_circular_dep(self, pkg_id, depends_on):
    visited = set()  # nodes already expanded; never expanded twice
    queue = [depends_on]  # start from depends_on
    while len(queue) > 0:
        current = queue.pop(0)  # take the first item off the queue
        if current == pkg_id:  # reached pkg_id → there is a cycle
            return True  # True means "adding this edge would create a cycle"
        if current in visited:  # already expanded → skip
            continue
        visited.add(current)  # mark as expanded so the BFS does not loop over the same node
        for dep in self.dependencies[current]:  # enqueue everything current depends on
            queue.append(dep)
    return False  # BFS finished without reaching pkg_id → no cycle
No background thread checks on a timer. Every public method calls this helper once before doing anything else, scanning all of self.packages on the spot.
A package whose time is up gets marked deprecated.
Example: the clock currently reads 1000
    self.packages = {"pkg_a": "1.0", "pkg_b": "2.0"}
    self.deprecation_times = {"pkg_a": 500, "pkg_b": 2000}
_process_deprecations():
    pkg_a: dep_time=500, now=1000, 500>0 and 1000>=500 → deprecated
    pkg_b: dep_time=2000, now=1000, 2000>0 but 1000<2000 → not yet
self.deprecated = {"pkg_a": True}
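The same walkthrough as a runnable sketch with an injectable clock. `FakeClock` and the free function `process_deprecations` are illustrative names (the notes implement this as a method on the class); the data matches the example above.

```python
class FakeClock:
    """Callable clock whose time is set by hand, for deterministic tests."""

    def __init__(self, now=0):
        self.now = now

    def __call__(self):
        return self.now


def process_deprecations(packages, deprecation_times, deprecated, clock):
    """Lazily mark packages whose expiry time has been reached."""
    now = clock()
    for pkg_id in packages:
        dep_time = deprecation_times.get(pkg_id, 0)  # 0 means "never expires"
        if dep_time > 0 and now >= dep_time:
            deprecated[pkg_id] = True


clock = FakeClock(1000)
packages = {"pkg_a": "1.0", "pkg_b": "2.0"}
deprecation_times = {"pkg_a": 500, "pkg_b": 2000}
deprecated = {}

process_deprecations(packages, deprecation_times, deprecated, clock)
first_pass = dict(deprecated)   # snapshot at time 1000

clock.now = 2000                # advance time by hand; nothing happens yet...
process_deprecations(packages, deprecation_times, deprecated, clock)
second_pass = dict(deprecated)  # ...until the next call scans again
print(first_pass, second_pass)
```

The boundary matters: with `now >= dep_time`, a package is already deprecated at the exact expiry timestamp.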
Before adding the dependency "pkg_id depends on depends_on", check whether adding it would create a cycle.
Method: BFS from depends_on over every reachable node. If pkg_id turns up, adding the edge would close a loop.
Example: the graph already contains
self.dependencies = {
    "pkg_b": ["pkg_a"],  # pkg_b depends on pkg_a
    "pkg_c": ["pkg_b"],  # pkg_c depends on pkg_b
}
Trying to add "pkg_a depends on pkg_c" → _has_circular_dep("pkg_a", "pkg_c")
    queue = ["pkg_c"]
    pop "pkg_c": current=pkg_c, not equal to pkg_a; enqueue pkg_c's deps = ["pkg_b"]
    queue = ["pkg_b"]
    pop "pkg_b": current=pkg_b, not equal to pkg_a; enqueue pkg_b's deps = ["pkg_a"]
    queue = ["pkg_a"]
    pop "pkg_a": current=pkg_a == pkg_id → return True
So adding it would produce the loop pkg_a → pkg_c → pkg_b → pkg_a
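The trace above, as a standalone function. This sketch swaps the list-based queue for `collections.deque`, whose `popleft()` is O(1) where `list.pop(0)` is O(n); it also takes the graph as an argument instead of reading `self.dependencies`. Both changes are for illustration only.

```python
from collections import deque


def has_circular_dep(dependencies, pkg_id, depends_on):
    """Return True if adding the edge pkg_id -> depends_on would close a cycle."""
    visited = set()
    queue = deque([depends_on])
    while queue:
        current = queue.popleft()
        if current == pkg_id:       # walked back to where the new edge starts
            return True
        if current in visited:
            continue
        visited.add(current)
        queue.extend(dependencies.get(current, []))
    return False


deps = {
    "pkg_b": ["pkg_a"],  # pkg_b depends on pkg_a
    "pkg_c": ["pkg_b"],  # pkg_c depends on pkg_b
}
closes_loop = has_circular_dep(deps, "pkg_a", "pkg_c")   # the example traced above
harmless = has_circular_dep(deps, "pkg_c", "pkg_a")      # pkg_a has no deps of its own
self_edge = has_circular_dep(deps, "pkg_a", "pkg_a")     # caught on the first pop
print(closes_loop, harmless, self_edge)
```

Note that a self-dependency is detected without any special case, because the start node is compared against `pkg_id` before anything else.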
register = list the package in the app store; install = install on the user's machine; uninstall = remove from the user's machine; status = current state
def register(self, pkg_id, version):  # list a package in the app store
    if pkg_id in self.packages:  # this pkg_id is already registered
        return "exists"
    self.packages[pkg_id] = version  # store the version
    self.installed[pkg_id] = False  # initial state: registered but not installed
    return "registered"
🟰 Same pattern as TaskQueue get_task_status
def get_status(self, pkg_id):  # query a package's state
    if pkg_id not in self.packages:  # never registered
        return "not found"
    if self.installed[pkg_id] is True:
        return "INSTALLED"
    return "REGISTERED"  # registered but not installed
⚠️ No TaskQueue counterpart: unique to PkgMgr (updates the installed dict)
def install(self, pkg_id):  # install a package
    if pkg_id not in self.packages:  # not even in the store
        return "not registered"
    if self.installed[pkg_id] is True:  # already installed
        return "already installed"
    self.installed[pkg_id] = True  # mark as installed
    return "installed"
⚠️ No TaskQueue counterpart: unique to PkgMgr (sets installed back to False; the entry is not deleted)
def uninstall(self, pkg_id):  # uninstall a package
    if pkg_id not in self.packages:  # never registered
        return "not registered"
    if self.installed[pkg_id] is False:  # registered but not installed, nothing to remove
        return "not installed"
    self.installed[pkg_id] = False  # mark as not installed
    return "uninstalled"
def __init__(self):
    self.packages = {}  # pkg_id → version string
    self.installed = {}  # pkg_id → True/False
self.packages = {  # package registry (pkg_id → version)
    "pkg_a": "1.0.0",  # version string
    "pkg_b": "2.3.1",
}
self.installed = {  # installed or not (pkg_id → True/False)
    "pkg_a": True,   # installed
    "pkg_b": False,  # not installed
}
self.deprecation_times = {}  # expiry times (added in L3)
self.deprecated = {}  # deprecated flags (added in L3)
self.dependencies = defaultdict(list)  # dependency edges (added in L4)
self.dep_history = defaultdict(list)  # dependency event log (added in L4)
self.locks = defaultdict(asyncio.Lock)  # per-pkg lock (added in L5)
list_installed = list the installed packages; search_by_prefix = find packages by id prefix; format = "pkg1(1.0.0), pkg2(2.3.1)"
def list_installed(self, sort_by="id"):  # list the installed packages
    items = []  # collects (pkg_id, version) tuples for installed packages
    for pkg_id in self.packages:
        if self.installed.get(pkg_id, False) is True:  # installed ones only
            version = self.packages[pkg_id]
            items.append((pkg_id, version))
    if len(items) == 0:  # none at all
        return ""  # empty string
    if sort_by == "id":
        items.sort(key=lambda x: x[0])  # x[0] = pkg_id, ascending
    elif sort_by == "version":
        items.sort(key=lambda x: (x[1], x[0]))  # version first (compared as a string, so "10.0.0" sorts before "2.0.0"), tie-break on id
    parts = []  # build the output string
    for pkg_id, version in items:
        parts.append(pkg_id + "(" + version + ")")  # e.g. "pkg_a(1.0.0)"
    return ", ".join(parts)
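Sorting by version as written compares strings character by character. Whether the spec wants that or numeric ordering is not stated in these notes, so the sketch below only shows the difference. `version_key` is a hypothetical helper and assumes purely numeric, dot-separated parts (no suffixes such as `-beta`).

```python
def version_key(version):
    """Turn "2.10.1" into (2, 10, 1) so parts compare as numbers."""
    return tuple(int(part) for part in version.split("."))


installed = [("pkg_a", "10.0.0"), ("pkg_b", "2.0.0"), ("pkg_c", "2.0.0")]

# String comparison: "1" < "2", so "10.0.0" lands before "2.0.0".
as_strings = sorted(installed, key=lambda x: (x[1], x[0]))

# Numeric comparison: (2, 0, 0) < (10, 0, 0); ties still fall back to pkg_id.
as_numbers = sorted(installed, key=lambda x: (version_key(x[1]), x[0]))

print([pkg for pkg, _ in as_strings])
print([pkg for pkg, _ in as_numbers])
```

If the tests only use single-digit version parts, both keys give the same order and the simpler string key is enough.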
🟰 Same pattern as InMemDB scan_by_prefix (startswith filter)
def search_by_prefix(self, prefix):  # find packages by id prefix
    matches = []  # collects matching (pkg_id, version) tuples
    for pkg_id in self.packages:
        if pkg_id.startswith(prefix):  # prefix match
            version = self.packages[pkg_id]
            matches.append((pkg_id, version))
    if len(matches) == 0:  # nothing matched
        return ""  # empty string
    matches.sort(key=lambda x: x[0])  # ascending by pkg_id
    parts = []  # build the output
    for pkg_id, version in matches:
        parts.append(pkg_id + "(" + version + ")")
    return ", ".join(parts)
def __init__(self):
    self.packages = {}
    self.installed = {}
self.packages = {  # package registry (pkg_id → version)
    "pkg_a": "1.0.0",
    "pkg_b": "2.3.1",
    "pkg_c": "1.5.0",
}
self.installed = {  # installed or not (pkg_id → True/False)
    "pkg_a": True,   # installed
    "pkg_b": True,   # installed
    "pkg_c": False,  # not installed
}
self.deprecation_times = {}  # expiry times (added in L3)
self.deprecated = {}  # deprecated flags (added in L3)
self.dependencies = defaultdict(list)  # dependency edges (added in L4)
self.dep_history = defaultdict(list)  # dependency event log (added in L4)
self.locks = defaultdict(asyncio.Lock)  # per-pkg lock (added in L5)
TTL = time-to-live; deprecation_time = when the package expires; lazy = no background thread, expiry is checked on each call; clock = injectable fake clock for tests
# ─────────── register, L3 version ───────────
# gains a ttl_ms parameter; once it elapses the package becomes DEPRECATED
🟰 Same pattern as FS add_file_with_ttl (expires_at = ts + ttl_ms)
def register(self, pkg_id, version, ttl_ms=0):  # register a new package
    self._process_deprecations()  # every public method starts by checking expiry
    if pkg_id in self.packages:  # already registered
        return "exists"
    self.packages[pkg_id] = version
    self.installed[pkg_id] = False  # initial state: not installed
    if ttl_ms > 0:  # added in L3: a TTL was given
        self.deprecation_times[pkg_id] = self.clock() + ttl_ms  # store the absolute expiry time
    return "registered"
# ─────────── get_status, L3 version ───────────
🟰 Similar to InMemDB get_at (TTL expiry checked inline)
def get_status(self, pkg_id):  # query a package's state
    self._process_deprecations()  # check expiry first
    if pkg_id not in self.packages:
        return "not found"
    if pkg_id in self.deprecated:  # added in L3: deprecated takes priority
        return "DEPRECATED"
    if self.installed[pkg_id] is True:
        return "INSTALLED"
    return "REGISTERED"  # registered but not installed
# ─────────── install, L3 version ───────────
# a deprecated package can no longer be installed
⚠️ Unique to PkgMgr: check deprecated status before install
def install(self, pkg_id):  # install a package
    self._process_deprecations()  # check expiry first
    if pkg_id not in self.packages:
        return "not registered"
    if pkg_id in self.deprecated:  # added in L3: deprecated packages are refused
        return "deprecated"
    if self.installed[pkg_id] is True:
        return "already installed"
    self.installed[pkg_id] = True
    return "installed"
def __init__(self, clock=None):
    self.packages = {}
    self.installed = {}
    self.deprecation_times = {}  # ← added in L3: pkg_id → expiry time
    self.deprecated = {}  # ← added in L3: pkg_id → True
    if clock is not None:  # ← added in L3: a fake clock can be injected
        self.clock = clock
    else:
        self.clock = lambda: time.time() * 1000
self.packages = {  # package registry (pkg_id → version)
    "pkg_a": "1.0.0",
    "pkg_b": "2.0.0",
}
self.installed = {  # installed or not
    "pkg_a": True,
    "pkg_b": False,
}
self.deprecation_times = {  # expiry times (added in L3)
    "pkg_a": 1500,  # expired once timestamp >= 1500
}
self.deprecated = {  # deprecated flags (added in L3)
    "pkg_a": True,  # already deprecated
}
self.dependencies = defaultdict(list)  # dependency edges (added in L4)
self.dep_history = defaultdict(list)  # dependency event log (added in L4)
self.locks = defaultdict(asyncio.Lock)  # per-pkg lock (added in L5)
Note: not every package has an entry in deprecation_times or deprecated. Read them with .get(pkg_id, 0) or an "in" check.
register gains a ttl_ms parameter. install gains a deprecated check. get_status gains the DEPRECATED state. Every public method now starts with _process_deprecations().
dependency = a package that must be installed first; circular = pkg_a→pkg_b→pkg_a closes a loop; conflict = something still depends on the package being uninstalled; history = log of every dependency event
def add_dependency(self, pkg_id, depends_on):  # add one dependency edge
    self._process_deprecations()  # check expiry first
    if pkg_id not in self.packages:  # pkg_id is not registered
        return "not registered"  # either side missing is enough to refuse
    if depends_on not in self.packages:  # depends_on is not registered
        return "not registered"
    if pkg_id == depends_on:  # depending on itself counts as circular (cheap check first)
        return "circular dependency"
    if self._has_circular_dep(pkg_id, depends_on):  # BFS helper checks for a cycle
        return "circular dependency"  # adding the edge would close a loop
    self.dependencies[pkg_id].append(depends_on)  # add to the dependency list
    event = "added dep: " + depends_on  # build the event string
    self.dep_history[pkg_id].append(event)  # record the event
    return "dependency added"
# ─────────── install, L4 version ───────────
# every dependency must be installed before this package can be
⚠️ Similar to TaskQueue _deps_met, but checks "installed" rather than "COMPLETED"
def install(self, pkg_id):  # install a package
    self._process_deprecations()  # check expiry first
    if pkg_id not in self.packages:
        return "not registered"
    if pkg_id in self.deprecated:
        return "deprecated"
    if self.installed[pkg_id] is True:
        return "already installed"
    # added in L4: check each dependency; all must be installed
    for dep_id in self.dependencies[pkg_id]:
        if not self.installed.get(dep_id, False):  # a dependency is not installed
            event = "install blocked: missing " + dep_id
            self.dep_history[pkg_id].append(event)  # record the blocked attempt
            return "missing dependency:" + dep_id
    self.installed[pkg_id] = True  # every dependency is installed, go ahead
    return "installed"
# ─────────── uninstall, L4 version ───────────
# no other installed package may depend on this one
⚠️ Unique to PkgMgr: check for dependents before uninstalling
def uninstall(self, pkg_id):  # uninstall a package
    self._process_deprecations()  # check expiry first
    if pkg_id not in self.packages:
        return "not registered"
    if self.installed[pkg_id] is False:  # not installed, nothing to remove
        return "not installed"
    # added in L4: scan the other packages for one that depends on this
    for other_id in self.packages:
        if other_id == pkg_id:  # skip itself
            continue
        if self.installed.get(other_id, False):  # installed packages only
            for dep_id in self.dependencies[other_id]:
                if dep_id == pkg_id:  # something depends on this → conflict
                    return "dependency conflict:" + other_id
    self.installed[pkg_id] = False  # no dependents, safe to uninstall
    return "uninstalled"
🟰 Same pattern as Workflow get_history (returns list[str])
def get_dependency_history(self, pkg_id):  # return the dependency event log
    self._process_deprecations()  # check expiry first
    if pkg_id not in self.packages:
        return []  # empty list
    result = []  # new list (a copy; never return the internal list)
    for entry in self.dep_history[pkg_id]:
        result.append(entry)
    return result
def __init__(self, clock=None):
    self.packages = {}
    self.installed = {}
    self.deprecation_times = {}
    self.deprecated = {}
    self.dependencies = defaultdict(list)  # ← added in L4
    self.dep_history = defaultdict(list)  # ← added in L4
    if clock is not None:
        self.clock = clock
    else:
        self.clock = lambda: time.time() * 1000
self.packages = {  # package registry
    "pkg_a": "1.0",
    "pkg_b": "2.0",
    "pkg_c": "1.5",
}
self.installed = {  # installed or not
    "pkg_a": True,
    "pkg_b": True,
    "pkg_c": False,
}
self.deprecation_times = {}  # expiry times
self.deprecated = {}  # deprecated flags
self.dependencies = {  # dependency edges (added in L4)
    "pkg_b": ["pkg_a"],           # pkg_b depends on pkg_a
    "pkg_c": ["pkg_a", "pkg_b"],  # pkg_c depends on pkg_a and pkg_b
}
self.dep_history = {  # dependency event log (added in L4)
    "pkg_b": ["added dep: pkg_a"],
    "pkg_c": [
        "added dep: pkg_a",
        "added dep: pkg_b",
        "install blocked: missing pkg_b",  # failed install attempts are logged too
    ],
}
self.locks = defaultdict(asyncio.Lock)  # per-pkg lock (added in L5)
Note that dependencies is a defaultdict(list): a missing key yields [] instead of raising KeyError.
install gains a dependency check (each dependency must be installed). uninstall gains a conflict check (refused while something depends on the package).
_has_circular_dep(pkg_id, depends_on): BFS from depends_on until exhausted; reaching pkg_id means there is a cycle
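The two L4 guards, the install-time dependency check and the uninstall-time conflict check, can be sketched as two small lookups. The function names are hypothetical, and one detail differs from the methods above: the conflict lookup here iterates the `dependencies` dict rather than `self.packages`, so with several dependents the one reported first may differ.

```python
def first_missing_dependency(pkg_id, dependencies, installed):
    """Return the first dependency of pkg_id that is not installed, else None."""
    for dep_id in dependencies.get(pkg_id, []):
        if not installed.get(dep_id, False):
            return dep_id
    return None


def first_installed_dependent(pkg_id, dependencies, installed):
    """Return an installed package that depends on pkg_id, else None."""
    for other_id, deps in dependencies.items():
        if other_id != pkg_id and installed.get(other_id, False) and pkg_id in deps:
            return other_id
    return None


dependencies = {"pkg_b": ["pkg_a"], "pkg_c": ["pkg_a", "pkg_b"]}
installed = {"pkg_a": True, "pkg_b": False, "pkg_c": False}

missing = first_missing_dependency("pkg_c", dependencies, installed)
free_to_remove = first_installed_dependent("pkg_a", dependencies, installed)

installed["pkg_b"] = True  # pkg_b gets installed
blocker = first_installed_dependent("pkg_a", dependencies, installed)
print(missing, free_to_remove, blocker)
```

Only installed dependents block an uninstall: while `pkg_b` is merely registered, `pkg_a` can still be removed.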
batch = several ops in one call; lock per pkg_id = each package has its own lock; gather = run every op concurrently
batch_operations(ops) takes a list of {"action": "install"|"uninstall", "pkg_id": ...}.
Run every op concurrently with asyncio.gather. Each pkg_id gets its own asyncio.Lock (different packages do not affect each other).
Return a result list in the same order as ops.
async def batch_operations(self, ops):  # batch operations (lock per key + gather)
    async def run_one(op):  # what a single op does
        action = op["action"]
        pkg_id = op["pkg_id"]
        lock = self.locks[pkg_id]  # this package's lock (defaultdict creates it on first use)
        async with lock:  # one op at a time per package
            if action == "install":
                return self.install(pkg_id)  # call the sync install directly
            elif action == "uninstall":
                return self.uninstall(pkg_id)  # call the sync uninstall directly
            else:  # any other action
                return "unknown action"  # not supported
    tasks = []  # collects the coroutines
    for op in ops:
        tasks.append(run_one(op))  # run_one(op) returns a coroutine that has not started yet
    results = await asyncio.gather(*tasks)  # unpack the list; gather starts them all
    return list(results)  # results in the same order as ops
def __init__(self, clock=None):
self.packages = {}
self.installed = {}
self.deprecation_times = {}
self.deprecated = {}
self.dependencies = defaultdict(list)
self.dep_history = defaultdict(list)
self.locks = defaultdict(asyncio.Lock) ← L5 加
if clock is not None:
self.clock = clock
else:
self.clock = lambda: time.time() * 1000
self.packages = {"pkg_a": "1.0", ...} 套件登記冊
self.installed = {"pkg_a": True, ...} 裝咗未
self.deprecation_times = {} 過期時間
self.deprecated = {} 已過期標記
self.dependencies = defaultdict(list) 依賴關係
self.dep_history = defaultdict(list) 依賴事件記錄
self.locks = { per-pkg 鎖(L5 加)
"pkg_a": <asyncio.Lock>, defaultdict 一 access 就自動造
"pkg_b": <asyncio.Lock>,
}
defaultdict(asyncio.Lock):第一次 access 自動 new 一把,第二次 access 同一個 key return 同一把。唔同 pkg 嘅 op 互不影響,同 pkg 嘅 op 排隊。
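A quick way to convince yourself of the "same key, same lock" behaviour is to compare object identity. A minimal sketch; the variable names are only for illustration.

```python
import asyncio
from collections import defaultdict

locks = defaultdict(asyncio.Lock)

first = locks["pkg_a"]    # first access: a new Lock is created and stored
second = locks["pkg_a"]   # second access: the stored Lock is returned
other = locks["pkg_b"]    # different key: a separate Lock

same_key_same_lock = first is second
different_key_different_lock = first is not other
# .get() does not trigger the default factory, so it never creates a lock
missing = locks.get("pkg_c")
```

Because only `locks[key]` creates entries, a membership test or `.get()` will not accidentally register a lock for an unknown package.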
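download = fetch a package from outside
semaphore = cap on how many downloads run at once
fail-fast = ineligible items return immediately and never enter the semaphore
external_call = stand-in for the external API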
download_packages(pkg_ids, external_call, max_concurrent):每個 pkg_id 如果註冊咗 + 未裝,就 await external_call(pkg_id) 然後 install。
未註冊 / 已裝 / deprecated → "skipped:<pkg_id>"(fail-fast,唔入 sem)。
成功 → "downloaded:<pkg_id>"。external_call 拋 exception → "error:<pkg_id>:<message>"。
用 asyncio.Semaphore(max_concurrent) 限制同時幾多個下載。
async def download_packages(self, pkg_ids, external_call, max_concurrent):  # batch download: only eligible packages enter the gate; after download, install under the local lock
    semaphore = asyncio.Semaphore(max_concurrent)  # N permits: cap on concurrent downloads
    async def run_one(pkg_id):  # download a single package
        # Fail-fast checks (outside the semaphore)
        if pkg_id not in self.packages:  # not even registered
            return "skipped:" + pkg_id  # leave at once, no permit taken
        if self.installed.get(pkg_id, False):  # already installed
            return "skipped:" + pkg_id
        if pkg_id in self.deprecated:  # already deprecated
            return "skipped:" + pkg_id
        # Eligible: only now take a permit
        async with semaphore:  # rate limiting
            try:  # the external call may raise
                await external_call(pkg_id)  # download from outside
                lock = self.locks[pkg_id]  # take the per-package lock before installing
                async with lock:
                    install_result = self.install(pkg_id)
                if install_result == "installed":  # install succeeded
                    return "downloaded:" + pkg_id
                else:  # install failed (for example a missing dependency)
                    return "error:" + pkg_id + ":" + install_result
            except Exception as e:  # the download failed part-way
                return "error:" + pkg_id + ":" + str(e)
    tasks = []
    for pkg_id in pkg_ids:
        tasks.append(run_one(pkg_id))  # collected, not started yet
    results = await asyncio.gather(*tasks)  # run them all, order preserved
    return list(results)
def __init__(self, clock=None):
同 L5 完全一樣,冇加新 field
semaphore 喺 method 入面開(per-call)
self.packages = {"pkg_a": "1.0", ...} 套件登記冊
self.installed = {"pkg_a": True, ...} 裝咗未
self.deprecation_times = {} 過期時間
self.deprecated = {} 已過期標記
self.dependencies = defaultdict(list) 依賴關係
self.dep_history = defaultdict(list) 依賴事件記錄
self.locks = defaultdict(asyncio.Lock) per-pkg 鎖
同 L5 一樣,冇加新 instance var。semaphore 喺 download_packages 入面即場 new,每次 call 都開一個新嘅。同時最多 max_concurrent 個 external_call。
街市買賣嘅撮合系統。買家出價,賣家開價,價啱就成交。要寫個 class 模擬。
想像個街市買賣板:
┌───────────────────────────────────────────┐
│ 買家 b1: buy 價 100 數量 5 │
│ 買家 b2: buy 價 102 數量 3 ← 出最高 │
│ 賣家 s1: sell 價 105 數量 4 │
│ 賣家 s2: sell 價 101 數量 6 ← 開最低 │
└───────────────────────────────────────────┘
每張單有:
order_id = 單嘅編號("b1"、"s1")
side = "buy" 或 "sell"
price = 出嘅價
quantity = 想買/賣幾多
remaining = 仲未成交嘅數量
撮合規則:
1. 最高價嘅 buy 對最低價嘅 sell
2. buy.price >= sell.price 先成交
3. 成交數量 = min(buy.remaining, sell.remaining)
4. 成交價用 sell 嘅價
例:上面個板撮合一次
best buy = b2 (價 102,最高)
best sell = s2 (價 101,最低)
buy.price 102 >= sell.price 101 → 可以成交
trade_qty = min(b2.remaining=3, s2.remaining=6) = 3
trade_price = 101(用 sell 價)
成交後:
b2.remaining = 0 → FILLED
s2.remaining = 3 → PARTIAL
trade desc: "trade: b2 x s2 @ 101 qty 3"
再 match 一次:
best buy = b1 (價 100)
best sell = s2 (價 101)
100 < 101 → 唔可以成交 → break
後面 level 加多啲嘢:
L2 加 list_orders:買賣板排序
L3 加 TTL:張單過咗時就 EXPIRED
L4 加 match engine:partial fill + trade history
L5 加 async batch:lock per order_id
L6 加 settle_trades:派去外部結算
import asyncio
import time
from collections import defaultdict

class OrderBook:
    def __init__(self, clock=None):
        self.orders = {}  # L1: all orders
        self.timestamp_counter = 0  # L1: deterministic ordering
        self.trades = []  # L4: trade descriptions
        self.trade_details = []  # L4: trade details (used by L6)
        self.locks = defaultdict(asyncio.Lock)  # added in L5
        if clock is not None:  # added in L3: injectable clock
            self.clock = clock  # use the clock passed in
        else:  # no fake clock supplied: fall back to real wall-clock time
            self.clock = lambda: time.time() * 1000  # default: real time in milliseconds
self.orders = {
"b1": {"side": "buy", "price": 100.0,
"quantity": 5, "remaining": 5,
"status": "ACTIVE", "timestamp": 1, "expiry": 0},
"s1": {"side": "sell", "price": 105.0,
"quantity": 4, "remaining": 4,
"status": "ACTIVE", "timestamp": 2, "expiry": 0},
}
# 第一層 key = 單嘅編號
# 第二層係個 dict,存呢張單嘅 info
order │ side │ price │ qty │ rem │ status │ ts
───────┼──────┼───────┼─────┼─────┼────────┼────
b1 │ buy │ 100 │ 5 │ 5 │ ACTIVE │ 1
s1 │ sell │ 105 │ 4 │ 4 │ ACTIVE │ 2
self.orders = {} # 上面個 table,開頭係空
self.timestamp_counter = 0 # 寫到第幾張,開頭 0
# 後面 L3 加 expiry / clock
# L4 加 trades / trade_details
# L5 加 locks
# L1 用唔住嗰啲,所以開頭簡單啲
# Next timestamp (the counter guarantees a deterministic order)
def _next_timestamp(self):  # used from L1
    self.timestamp_counter += 1
    return self.timestamp_counter  # the new timestamp

# Lazy check: orders past their expiry are marked EXPIRED
def _process_expiries(self):  # used from L3: called first in every public method
    now = self.clock()  # current time
    for order_id in self.orders:
        order = self.orders[order_id]
        # only ACTIVE / PARTIAL orders can become EXPIRED
        if order["status"] != "ACTIVE" and order["status"] != "PARTIAL":
            continue
        expiry = order.get("expiry", 0)  # 0 means no TTL
        if expiry > 0 and now >= expiry:  # expiry reached: the order is dead at that instant
            order["status"] = "EXPIRED"
_next_timestamp() — 點解要自己整 counter?
想像如果用 time.time():
兩張單同一毫秒落 → timestamp 一樣
sort 就亂咗(tie-break 唔確定)
用 counter:
每 call 一次 +1,保證每張單 timestamp 不同
sort 出嚟次序一定確定
每個 public method(place_order / cancel_order / get_order / list_orders / match_orders...)
入面第一行都 call self._process_expiries()。即係每次有人用呢個 OrderBook 之前,
先掃一次:有冇邊張單已經過咗 TTL?有就標記做 EXPIRED。
唔需要 background timer 或 thread。lazy = 用嗰陣先 check。
例:
clock 而家係 500
orders = {
"a": {status: ACTIVE, expiry: 0 } ← 冇 TTL,skip
"b": {status: ACTIVE, expiry: 400} ← 400 < 500 → EXPIRED
"c": {status: ACTIVE, expiry: 800} ← 800 > 500 → 仲未過
"d": {status: FILLED, expiry: 100} ← 已 FILLED → skip
}
_process_expiries 之後:
b.status = "EXPIRED"
其餘不變
order = one order entry
side = "buy" or "sell"
price = the quoted price
quantity = how many units
status = the order's current state
def place_order(self, order_id, side, price, quantity, ttl_ms=0):  # place an order
    self._process_expiries()  # sweep expired orders first
    if order_id in self.orders:  # the id is already taken
        return "exists"
    order = {
        "side": side,  # "buy" or "sell"
        "price": price,  # quoted price
        "quantity": quantity,  # original quantity
        "remaining": quantity,  # starts equal to quantity
        "status": "ACTIVE",  # state of a new order
        "timestamp": self._next_timestamp(),  # arrival order
        "expiry": 0,  # used from L3
    }
    if ttl_ms > 0:  # L3: record an expiry only when a TTL is given
        order["expiry"] = self.clock() + ttl_ms  # expiry = now + TTL
    self.orders[order_id] = order  # store in the book
    return "placed"

def cancel_order(self, order_id):  # cancel an order (not deleted, only the status changes)
    self._process_expiries()
    if order_id not in self.orders:
        return "not found"
    self.orders[order_id]["status"] = "CANCELLED"
    return "cancelled"

def get_order(self, order_id):  # describe one order
    self._process_expiries()
    if order_id not in self.orders:
        return "not found"
    order = self.orders[order_id]
    # build the string "id(side price x remaining)[status]"
    result = order_id + "("
    result = result + order["side"] + " "  # buy or sell
    result = result + str(order["price"]) + " x "  # price
    result = result + str(order["remaining"]) + ")"  # remaining quantity
    result = result + "[" + order["status"] + "]"  # status tag
    return result
共通:
- 都係 dict 存 entity(self.orders vs self.tasks)
- 都用 counter 做確定性排序
- 都有 CRUD pattern
唔同:
- OrderBook 有 side("buy"/"sell")— TaskQueue 冇
- OrderBook 有 price + quantity + remaining
TaskQueue 得 priority 一個 number
- OrderBook 嘅 status 開頭 ACTIVE
TaskQueue 開頭 QUEUED
- OrderBook cancel 改 status(唔刪)
TaskQueue 冇 cancel
- OrderBook 返 string:"placed"/"exists"/"cancelled"/"not found"
TaskQueue 返 bool
def __init__(self):
self.orders = {}
self.timestamp_counter = 0
self.orders = { 訂單簿(order_id → info dict)
"b1": {
"side": "buy", 買定賣
"price": 100.0, 出嘅價
"quantity": 5, 原始數量
"remaining": 5, 未成交數量
"status": "ACTIVE", 而家狀態
"timestamp": 1, 登記順序
},
}
self.timestamp_counter = 1 全局計數器
self.trades = [] 成交記錄(L4 先加)
self.trade_details = [] 成交詳情(L4 先加)
self.locks = defaultdict(asyncio.Lock) per-order 鎖(L5 先加)
有人話:「我想落張單,編號 "b1",買 100 蚊 5 件。」
place_order("b1", "buy", 100.0, 5)
# Step 1: check "b1" 喺唔喺 self.orders 入面?
# 冇 → 繼續
# 有 → return "exists"
# Step 2: counter +1 → timestamp_counter = 1
# Step 3: self.orders["b1"] = {...}
# Step 4: return "placed"
cancel_order("b1")
# self.orders["b1"]["status"] = "CANCELLED"
# return "cancelled"
# 注:張單仲喺 self.orders 入面!只係 status 變咗
self.orders["b1"] = {
side="buy", price=100.0, remaining=5, status="ACTIVE", ...
}
get_order("b1") 砌:
"b1" + "(" = "b1("
+ "buy" + " " = "b1(buy "
+ "100.0" + " x " = "b1(buy 100.0 x "
+ "5" + ")" = "b1(buy 100.0 x 5)"
+ "[" + "ACTIVE" + "]" = "b1(buy 100.0 x 5)[ACTIVE]"
用 remaining 唔係 quantity(match 後會減少)
Status 有 5 款:
ACTIVE / CANCELLED / FILLED / PARTIAL / EXPIRED
place_order("b1", "buy", 100.0, 5) → "placed"
place_order("s1", "sell", 105.0, 3) → "placed"
place_order("b1", "buy", 99.0, 2) → "exists"
get_order("b1") → "b1(buy 100.0 x 5)[ACTIVE]"
get_order("s1") → "s1(sell 105.0 x 3)[ACTIVE]"
get_order("x") → "not found"
cancel_order("b1") → "cancelled"
get_order("b1") → "b1(buy 100.0 x 5)[CANCELLED]"
list_orders = list one side of the book
buy = highest price first
sell = lowest price first
tie = broken by timestamp
def list_orders(self, side):  # list the live orders on one side
    self._process_expiries()  # sweep expired orders first
    active = []  # orders that pass the filter
    for order_id in self.orders:
        order = self.orders[order_id]
        if order["side"] != side:  # wrong side
            continue
        # only ACTIVE and PARTIAL (a half-filled order is still queued)
        if order["status"] != "ACTIVE" and order["status"] != "PARTIAL":
            continue
        active.append((order_id, order))
    if len(active) == 0:  # nothing to list
        return ""
    if side == "buy":
        # buy side: higher price first, so sort on -price; same price, smaller timestamp first
        active.sort(key=lambda x: (-x[1]["price"], x[1]["timestamp"]))
    elif side == "sell":
        # sell side: lower price first, so sort on price; same price, smaller timestamp first
        active.sort(key=lambda x: (x[1]["price"], x[1]["timestamp"]))
    parts = []  # formatted entries
    for order_id, order in active:
        # format: "id(pricexremaining)", e.g. "b2(102x3)"; unlike get_order there are no spaces around the x
        entry = order_id + "(" + str(order["price"]) + "x" + str(order["remaining"]) + ")"
        parts.append(entry)
    return ", ".join(parts)  # joined with ", "
共通:
- 都用 sort key tuple 做排序
- 都用負號 trick:大嘅排先就用 -value
- 都 filter 一個 status
唔同:
- OrderBook 一個 method (list_orders) 處理兩邊
sort key 按 side 決定方向
TaskQueue L2 係 status lifecycle,唔係 sort
- OrderBook 返「格式化 string」,TaskQueue 返 list
- OrderBook 接受 PARTIAL(已成交一半仲喺度排)
TaskQueue 只要 QUEUED
def __init__(self):
self.orders = {}
self.timestamp_counter = 0
self.orders = { 訂單簿(同 L1)
"b1": {
"side": "buy", 買定賣
"price": 100.0, 出嘅價
"quantity": 5, 原始數量
"remaining": 5, 未成交數量
"status": "ACTIVE", 而家狀態
"timestamp": 1, 登記順序
},
}
self.timestamp_counter = 1 全局計數器
self.trades = [] 成交記錄(L4 先加)
self.trade_details = [] 成交詳情(L4 先加)
self.locks = defaultdict(asyncio.Lock) per-order 鎖(L5 先加)
buy: (-price, timestamp)
-price 細 = price 大 = 排先(最高價最有吸引力)
timestamp 細 = 先到 = tie 嗰陣排先
sell: (price, timestamp)
price 細 = 排先(最低價最有吸引力)
timestamp 細 = 先到 = tie 嗰陣排先
self.orders = {
"b1": {side="buy", price=100, remaining=5, status="ACTIVE", ts=1},
"b2": {side="buy", price=102, remaining=3, status="ACTIVE", ts=2},
"b3": {side="buy", price=100, remaining=4, status="ACTIVE", ts=3},
"s1": {side="sell", price=105, remaining=4, status="ACTIVE", ts=4},
"b4": {side="buy", price=99, remaining=2, status="CANCELLED", ts=5},
}
list_orders("buy"):
# Step 1: filter side="buy" + (ACTIVE or PARTIAL)
# b1 ✅ b2 ✅ b3 ✅ s1 ❌(sell) b4 ❌(CANCELLED)
# active = [b1, b2, b3]
# Step 2: 計每張單 sort key (-price, timestamp)
# b1 → (-100, 1)
# b2 → (-102, 2) ← -102 最細 → 排第一
# b3 → (-100, 3)
# Step 3: sort 後 → [b2, b1, b3]
# b1 同 b3 -price 一樣 (-100),b1.ts=1 細過 b3.ts=3 → b1 排先
# Step 4: 砌 string
# "b2(102x3), b1(100x5), b3(100x4)"
list_orders("sell"):
# 假設只有 s1=105, s2=101, s3=101 (ts: 4, 5, 6)
# sort key (price, timestamp)
# s1 → (105, 4)
# s2 → (101, 5) ← 101 最細 → 排第一
# s3 → (101, 6)
# sort 後 → [s2, s3, s1]
# return "s2(101x...), s3(101x...), s1(105x...)"
# 冇 sell 單 → return ""
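The sort-key walk-through above can be replayed in a few lines. A minimal sketch: `ranked_ids` is a hypothetical helper that returns only the ids rather than the formatted string, and the `sign` trick is a compact stand-in for the two separate sort calls in list_orders.

```python
orders = {
    "b1": {"side": "buy", "price": 100, "remaining": 5, "status": "ACTIVE", "timestamp": 1},
    "b2": {"side": "buy", "price": 102, "remaining": 3, "status": "ACTIVE", "timestamp": 2},
    "b3": {"side": "buy", "price": 100, "remaining": 4, "status": "ACTIVE", "timestamp": 3},
    "s1": {"side": "sell", "price": 105, "remaining": 4, "status": "ACTIVE", "timestamp": 4},
    "b4": {"side": "buy", "price": 99, "remaining": 2, "status": "CANCELLED", "timestamp": 5},
}


def ranked_ids(orders, side):
    """Return live order ids for one side, best price first, earliest first on ties."""
    live = [
        (oid, o) for oid, o in orders.items()
        if o["side"] == side and o["status"] in ("ACTIVE", "PARTIAL")
    ]
    sign = -1 if side == "buy" else 1  # negate price so the highest buy sorts first
    live.sort(key=lambda item: (sign * item[1]["price"], item[1]["timestamp"]))
    return [oid for oid, _ in live]


print(ranked_ids(orders, "buy"))   # ['b2', 'b1', 'b3']
print(ranked_ids(orders, "sell"))  # ['s1']
```

b1 and b3 share the key -100 on price, so the timestamp decides and b1 (ts=1) comes before b3 (ts=3).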
TTL = time-to-live
expiry = the moment the order expires
lazy = swept only when someone uses the book
clock = the time function
# ─────────── place_order, L3 version ───────────
# One extra parameter, ttl_ms; when it is > 0 the expiry time is recorded
def place_order(self, order_id, side, price, quantity, ttl_ms=0):  # place an order (buy or sell)
    self._process_expiries()  # sweep expiries before placing
    if order_id in self.orders:  # already exists?
        return "exists"
    order = {
        "side": side,  # buy or sell
        "price": price,  # quoted price
        "quantity": quantity,  # original quantity
        "remaining": quantity,  # starts equal to quantity
        "status": "ACTIVE",  # state of a new order
        "timestamp": self._next_timestamp(),  # arrival order
        "expiry": 0,  # default: no expiry
    }
    # New in L3: record an expiry only when a TTL is given
    if ttl_ms > 0:
        order["expiry"] = self.clock() + ttl_ms  # now + TTL = the expiry instant
    self.orders[order_id] = order  # store in the book
    return "placed"

# ─────────── _process_expiries (helper shown earlier) ───────────
# Called at the top of each method by default; more precisely, every entry point where
# the spec requires the latest lazy effect to be visible must refresh:
#   place_order / cancel_order / get_order / list_orders / match_orders / ...
def _process_expiries(self):  # mark expired orders (lazy; nothing is deleted)
    now = self.clock()  # current time
    for order_id in self.orders:
        order = self.orders[order_id]
        if order["status"] != "ACTIVE" and order["status"] != "PARTIAL":
            continue  # already FILLED / CANCELLED / EXPIRED: skip
        expiry = order.get("expiry", 0)
        if expiry > 0 and now >= expiry:  # expiry reached
            order["status"] = "EXPIRED"
共通:
- 都係 status lifecycle 加新狀態
- 都係 lazy helper:每次有人用之前先掃一次
- 都要 inject 時間(OrderBook clock / TaskQueue timestamp 參數)
唔同:
- OrderBook L3 = TTL Expiry(過期就 EXPIRED,唔再做嘢)
TaskQueue L3 = Retry(失敗再試,仲有得救)
- OrderBook 用 self.clock()(callable)
TaskQueue 用 method 參數傳入 timestamp
- OrderBook EXPIRED 係終點
TaskQueue RETRY_SCHEDULED 可以變返 QUEUED
- OrderBook L3 改 place_order signature 加 ttl_ms
TaskQueue L3 加新 method configure_retry
ACTIVE 落咗單,等緊 match
PARTIAL 部分成交(L4 引入)
FILLED 完全成交(L4 引入)
CANCELLED 畀人 cancel
EXPIRED ← L3 新加:過咗 TTL
def __init__(self, clock=None):
self.orders = {}
self.timestamp_counter = 0
if clock is not None: ← L3 加
self.clock = clock
else:
self.clock = lambda: time.time() * 1000
# clock 係 callable,每次 call 就返而家係幾時(ms)
# Test 嗰陣可以 inject fake clock 控制時間
self.orders = { 訂單簿(order_id → info dict)
"b1": {
"side": "buy", 買定賣
"price": 100.0, 出嘅價
"quantity": 5, 原始數量
"remaining": 5, 未成交數量
"status": "ACTIVE", 而家狀態
"timestamp": 1, 登記順序
"expiry": 0, 冇 TTL(L3 加)
},
"b2": {
"side": "buy",
"price": 102.0,
"quantity": 3,
"remaining": 3,
"status": "ACTIVE",
"timestamp": 2,
"expiry": 500, 500ms 過期(L3 加)
},
}
self.timestamp_counter = 2 全局計數器
self.trades = [] 成交記錄(L4 先加)
self.trade_details = [] 成交詳情(L4 先加)
self.locks = defaultdict(asyncio.Lock) per-order 鎖(L5 先加)
fake_time = 100
clock = lambda: fake_time
ob = OrderBook(clock=clock)
# t=100: 落單,TTL = 200ms
ob.place_order("b1", "buy", 100.0, 5, ttl_ms=200)
# self.orders["b1"]["expiry"] = 100 + 200 = 300
# t=250: 仲未過期
fake_time = 250
ob.get_order("b1")
# _process_expiries 入面:
# now = 250
# b1.expiry = 300 → 250 < 300 → 唔過期
# → "b1(buy 100.0 x 5)[ACTIVE]"
# t=350: 過咗期
fake_time = 350
ob.get_order("b1")
# _process_expiries 入面:
# now = 350
# b1.expiry = 300 → 350 >= 300 → 過期
# b1.status = "EXPIRED"
# → "b1(buy 100.0 x 5)[EXPIRED]"
冇 background thread 自動 check expiry
所以要喺「有人用呢個 OrderBook」嗰陣先掃
lazy = 慳 CPU,又保證所有讀寫都睇到最新狀態
place_order: 落單前掃,避免成交咗過期單
match_orders: match 前掃,過期嘅唔好揀入嚟
get_order: 畀人查嗰陣返最新 status
list_orders: 返 active list 前掃
match = pair a buy with a sell
best buy/sell = the most attractive order on each side
partial fill = only part of the quantity trades
trade = a record of one execution
def match_orders(self):  # matching engine: keep pairing the best buy with the best sell until no pair matches
    self._process_expiries()  # sweep expired orders first
    new_trades = []  # trades produced by this call
    while True:  # keep matching until nothing matches
        # Step 1: pick the most eager buyer on the buy side
        best_buy_id = None
        best_buy = None
        for order_id in self.orders:
            order = self.orders[order_id]
            if order["side"] != "buy":
                continue
            # only ACTIVE or PARTIAL (a PARTIAL order still has quantity left)
            if order["status"] != "ACTIVE" and order["status"] != "PARTIAL":
                continue
            if order["remaining"] <= 0:  # nothing left
                continue
            if best_buy is None:  # first candidate
                best_buy_id = order_id
                best_buy = order
            else:  # compare the newcomer against the current candidate
                if order["price"] > best_buy["price"]:  # higher price wins
                    best_buy_id = order_id
                    best_buy = order
                elif order["price"] == best_buy["price"]:  # same price: compare time
                    if order["timestamp"] < best_buy["timestamp"]:  # earlier wins
                        best_buy_id = order_id
                        best_buy = order
        if best_buy is None:  # no buy to match
            break
        # Step 2: pick the seller willing to sell cheapest
        best_sell_id = None
        best_sell = None
        for order_id in self.orders:
            order = self.orders[order_id]
            if order["side"] != "sell":
                continue
            if order["status"] != "ACTIVE" and order["status"] != "PARTIAL":
                continue
            if order["remaining"] <= 0:  # nothing left
                continue
            if best_sell is None:  # first candidate
                best_sell_id = order_id
                best_sell = order
            else:  # compare the newcomer against the current candidate
                if order["price"] < best_sell["price"]:  # lower price wins
                    best_sell_id = order_id
                    best_sell = order
                elif order["price"] == best_sell["price"]:  # same price: compare time
                    if order["timestamp"] < best_sell["timestamp"]:  # earlier wins
                        best_sell_id = order_id
                        best_sell = order
        if best_sell is None:  # no sell to match
            break
        # Step 3: check that the prices overlap; if the best bid is too low, stop
        if best_buy["price"] < best_sell["price"]:
            break
        # Step 4: trade quantity is the smaller remaining, since that side is exhausted first
        trade_qty = best_buy["remaining"]
        if best_sell["remaining"] < trade_qty:
            trade_qty = best_sell["remaining"]
        trade_price = best_sell["price"]  # trades execute at the sell price
        # Step 5: deduct the traded quantity from both sides
        best_buy["remaining"] = best_buy["remaining"] - trade_qty
        best_sell["remaining"] = best_sell["remaining"] - trade_qty
        # Step 6: an exhausted side becomes FILLED; otherwise it stays on the book as PARTIAL
        if best_buy["remaining"] == 0:
            best_buy["status"] = "FILLED"
        else:
            best_buy["status"] = "PARTIAL"
        if best_sell["remaining"] == 0:
            best_sell["status"] = "FILLED"
        else:
            best_sell["status"] = "PARTIAL"
        # Step 7: record the description and the detail; L6 settlement works from these
        desc = "trade: " + best_buy_id + " x " + best_sell_id
        desc = desc + " @ " + str(trade_price)
        desc = desc + " qty " + str(trade_qty)
        new_trades.append(desc)  # result of this call
        self.trades.append(desc)  # global history
        self.trade_details.append({  # detail kept for L6 settle_trades
            "buyer_id": best_buy_id,
            "seller_id": best_sell_id,
            "price": trade_price,
            "quantity": trade_qty,
        })
    return new_trades  # trades matched by this call

def get_trade_history(self):  # all trade descriptions so far
    self._process_expiries()
    result = []
    for trade in self.trades:
        result.append(trade)  # copy, so the caller never holds the internal list
    return result
共通:
- 都係 L4 加最大嗰個新 pattern
- 都要 scan self.entities 揀符合條件嘅
唔同:
- OrderBook L4 = MATCH ENGINE(完全新 pattern)
要兩邊揀(buy 同 sell),夾埋計成交數量
- TaskQueue L4 = Dependencies(DAG)
只係多咗 deps check
- OrderBook 有 partial fill:一張單可以拆幾次成交
- OrderBook 有 trade history(self.trades)
TaskQueue 冇歷史
def __init__(self, clock=None):
self.orders = {}
self.timestamp_counter = 0
self.trades = [] ← L4 加:成交描述 list
self.trade_details = [] ← L4 加:成交詳情(buyer/seller/price/qty)
self.clock = ...
self.orders = { 訂單簿(match 之後)
"b1": {
"side": "buy",
"price": 100,
"quantity": 5, 原始數量
"remaining": 2, match 咗 3 之後
"status": "PARTIAL", 仲有貨,部分成交
"timestamp": 1,
"expiry": 0,
},
"s1": {
"side": "sell",
"price": 100,
"quantity": 3,
"remaining": 0, 全部賣晒
"status": "FILLED", 終點
"timestamp": 2,
"expiry": 0,
},
}
self.timestamp_counter = 2 全局計數器
self.trades = [ 成交描述(L4 加)
"trade: b1 x s1 @ 100 qty 3",
]
self.trade_details = [ 成交詳情(L4 加,L6 settle 用)
{"buyer_id": "b1", "seller_id": "s1", "price": 100, "quantity": 3},
]
self.locks = defaultdict(asyncio.Lock) per-order 鎖(L5 先加)
while True:
1. 搵 best buy(價最高,同價先到優先)
2. 搵 best sell(價最低,同價先到優先)
3. 任一邊冇 → break
4. buy.price < sell.price → break(價對唔上)
5. trade_qty = min(buy.rem, sell.rem)
6. 兩邊都扣 trade_qty
7. remaining == 0 → FILLED,否則 PARTIAL
8. 記低 trade
9. 返去 loop 開頭(可能仲有得撮合)
# 每次 loop 至少 fill 一邊(trade_qty = min)
# 所以 loop 最多 N 次就停(N = order 數量)
# 開頭 4 張單:
self.orders = {
"b1": {side=buy, price=102, remaining=5, status=ACTIVE, ts=1},
"b2": {side=buy, price=100, remaining=3, status=ACTIVE, ts=2},
"s1": {side=sell, price=101, remaining=2, status=ACTIVE, ts=3},
"s2": {side=sell, price=103, remaining=4, status=ACTIVE, ts=4},
}
# Loop 1:
# best buy = b1 (102 最高)
# best sell = s1 (101 最低)
# 102 >= 101 ✅
# trade_qty = min(5, 2) = 2
# trade_price = 101 (sell 價)
# b1.remaining = 3 → PARTIAL
# s1.remaining = 0 → FILLED
# trade: "trade: b1 x s1 @ 101 qty 2"
# Loop 2:
# best buy = b1 (102 仲喺度,PARTIAL,remaining=3)
# best sell = s2 (s1 FILLED 唔再考慮,剩 s2)
# 102 < 103 ❌ → break
# match_orders() return ["trade: b1 x s1 @ 101 qty 2"]
# self.trades = ["trade: b1 x s1 @ 101 qty 2"]
# 之後 get_order("b1") → "b1(buy 102 x 3)[PARTIAL]"
# 之後 get_order("s1") → "s1(sell 101 x 0)[FILLED]"
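The trace above can be checked with a condensed version of the loop. This is a sketch under stated assumptions, not the notes' implementation: the free functions `best` and `match` are hypothetical, `min()` with a key tuple replaces the hand-written scans, and expiry and trade_details are left out.

```python
def best(orders, side):
    """Pick the best live order on one side: best price, then earliest timestamp."""
    live = [
        (oid, o) for oid, o in orders.items()
        if o["side"] == side
        and o["status"] in ("ACTIVE", "PARTIAL")
        and o["remaining"] > 0
    ]
    if not live:
        return None
    sign = -1 if side == "buy" else 1  # highest buy price / lowest sell price
    return min(live, key=lambda item: (sign * item[1]["price"], item[1]["timestamp"]))


def match(orders):
    trades = []
    while True:
        buy, sell = best(orders, "buy"), best(orders, "sell")
        if buy is None or sell is None:
            break
        (buy_id, b), (sell_id, s) = buy, sell
        if b["price"] < s["price"]:  # prices do not overlap
            break
        qty = min(b["remaining"], s["remaining"])
        for o in (b, s):
            o["remaining"] -= qty
            o["status"] = "FILLED" if o["remaining"] == 0 else "PARTIAL"
        trades.append(f"trade: {buy_id} x {sell_id} @ {s['price']} qty {qty}")
    return trades


book = {
    "b1": {"side": "buy", "price": 102, "remaining": 5, "status": "ACTIVE", "timestamp": 1},
    "b2": {"side": "buy", "price": 100, "remaining": 3, "status": "ACTIVE", "timestamp": 2},
    "s1": {"side": "sell", "price": 101, "remaining": 2, "status": "ACTIVE", "timestamp": 3},
    "s2": {"side": "sell", "price": 103, "remaining": 4, "status": "ACTIVE", "timestamp": 4},
}
trades = match(book)
print(trades)  # ['trade: b1 x s1 @ 101 qty 2']
```

Every pass through the loop fully fills at least one order, which is why the loop always terminates.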
街市常規:賣家開咩價就咩價成交
例如 sell 開 101,buy 願出 102(出多咗都肯買)
成交價 101 → 買家賺
反過嚟 buy 出 102,sell 開 100(情願平啲賣)
成交價都係 sell 嘅 100 → 買家賺更多
self.trades = ["trade: b1 x s1 @ 101 qty 2"]
只係 string,難 parse 返
self.trade_details = [{"buyer_id": "b1", "seller_id": "s1", ...}]
L6 settle_trades 要用 buyer_id / seller_id 查兩邊 status
方便先記埋 dict
async = run together
batch = place several orders in one call
lock per order_id = each order id gets its own lock
gather = run them concurrently
async def batch_place_orders(self, orders):  # place several orders in one call
    async def run_one(order_data):  # place a single order
        oid = order_data["order_id"]
        side = order_data["side"]
        price = order_data["price"]
        qty = order_data["quantity"]
        lock = self.locks[oid]  # this order's own lock
        async with lock:  # enter only once the lock is held
            return self.place_order(oid, side, price, qty)  # the actual placement
    tasks = []  # coroutines to run
    for order_data in orders:
        tasks.append(run_one(order_data))
    results = await asyncio.gather(*tasks)  # run them all and wait for every one
    return list(results)
共通:
- 都係 async + gather pattern
- 都用 asyncio.Lock 保護 shared state
唔同:
- OrderBook L5 = batch_place_orders(每個 op 做一次就完)
用 Gather pattern:N 個 coroutine,每個做一次
Lock per order_id(同一張單嘅 op 唔好撞)
- TaskQueue L5 = run_workers(worker pool,循環攞工作)
用 Worker Pool pattern:N 個 worker while True loop
一個全局 Lock 保護 queue access
- OrderBook 用 defaultdict(asyncio.Lock)
TaskQueue 用一個 self._lock
def __init__(self, clock=None):
self.orders = {}
self.timestamp_counter = 0
self.trades = []
self.trade_details = []
self.locks = defaultdict(asyncio.Lock) ← L5 加
self.clock = ...
# defaultdict(asyncio.Lock) 嘅意思:
# self.locks["b1"] ← 第一次 access:自動建一個新 asyncio.Lock()
# self.locks["b1"] ← 第二次 access:返同一個 Lock 對象
# self.locks["b2"] ← 另一個 key,建另一個獨立 Lock
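Imagine two coroutines trying to place the same order at the same time:
  coroutine A: place_order("b1", "buy", 100, 5)
  coroutine B: place_order("b1", "buy", 200, 3)   ← same id!
Without a lock (this interleaving needs an await between the check and the write; the synchronous place_order above has none, so read it as the general async case):
  A: check "b1" not in orders → passes
  B: check "b1" not in orders → passes (A has not written yet!)
  A: writes self.orders["b1"] = {...100, 5...}
  B: writes self.orders["b1"] = {...200, 3...}   ← overwrites A
  both return "placed" → BUG (one of them should be "exists")
With a lock per order_id:
  A: takes self.locks["b1"] → enters
     check + write + return "placed"
  B: waits on self.locks["b1"]
  A finishes → B enters
     check → already exists → return "exists"
✅ Correct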
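The race only shows up when something awaits between the check and the write. The sketch below forces that with `asyncio.sleep(0)`; the function `place` and that extra await are assumptions made purely to expose the interleaving, and are not part of the OrderBook code in these notes.

```python
import asyncio
from collections import defaultdict


async def place(orders, order_id, price, locks=None):
    async def check_then_write():
        if order_id in orders:
            return "exists"
        await asyncio.sleep(0)  # hypothetical await between check and write
        orders[order_id] = price
        return "placed"

    if locks is None:  # no protection at all
        return await check_then_write()
    async with locks[order_id]:  # per-id lock serialises check + write
        return await check_then_write()


async def demo():
    unlocked = {}
    racy = await asyncio.gather(
        place(unlocked, "b1", 100), place(unlocked, "b1", 200)
    )
    locked = {}
    locks = defaultdict(asyncio.Lock)
    safe = await asyncio.gather(
        place(locked, "b1", 100, locks), place(locked, "b1", 200, locks)
    )
    return racy, unlocked, safe, locked


racy, unlocked, safe, locked = asyncio.run(demo())
print(racy, unlocked)  # ['placed', 'placed'] {'b1': 200}
print(safe, locked)    # ['placed', 'exists'] {'b1': 100}
```

With the lock in place the second coroutine cannot run its check until the first has finished writing.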
全局 lock:任何時間只有 1 個 op 喺度做
batch 100 張單 → 串行做 → 完全冇 concurrency
Per-order_id lock:
"b1" 同 "b2" 嘅 op 可以同時做(鎖唔同)
只有同一 id 嘅先 serialize
maximum concurrency under correctness constraint
orders = [
{"order_id": "b1", "side": "buy", "price": 100, "quantity": 5},
{"order_id": "s1", "side": "sell", "price": 105, "quantity": 3},
{"order_id": "b1", "side": "buy", "price": 200, "quantity": 1}, # 重複 id
]
await ob.batch_place_orders(orders)
# 砌 3 個 coroutine(未行)
# gather 同時開動:
# T0:
# C0 攞 locks["b1"] ✅,C1 攞 locks["s1"] ✅
# C2 想攞 locks["b1"],但 C0 仲未還 → 等
# T0+:
# C0: place_order("b1", ...) → 寫入 → return "placed"
# C1: place_order("s1", ...) → 寫入 → return "placed"
# T1:
# C0 / C1 還鎖
# C2 攞到 locks["b1"]
# C2: place_order("b1", ...) → 已存在 → return "exists"
# results = ["placed", "placed", "exists"]
# 注:return 順序同 input 順序一致(gather 保證)
settle = settlement
external_call = the external API
semaphore = cap on how many run at once
fail-fast = ineligible items return immediately
async def settle_trades(self, trade_indices, external_call, max_concurrent):  # settle trades externally: validate first, then take a permit, then call out
    semaphore = asyncio.Semaphore(max_concurrent)  # cap on concurrent settlements
    async def run_one(index):  # handle one trade index
        # Fail-fast: index range (checked before the semaphore, so invalid items never hold a permit)
        if index < 0 or index >= len(self.trade_details):  # out of range
            return "skipped:" + str(index)
        detail = self.trade_details[index]  # trade detail
        buyer_id = detail["buyer_id"]
        seller_id = detail["seller_id"]
        # Fail-fast: skip when either order is missing or CANCELLED
        buyer = self.orders.get(buyer_id, None)
        seller = self.orders.get(seller_id, None)
        if buyer is None or seller is None:  # either side missing
            return "skipped:" + str(index)
        if buyer["status"] == "CANCELLED" or seller["status"] == "CANCELLED":  # either side cancelled
            return "skipped:" + str(index)
        # Eligible: take a permit, then call out inside try/except
        async with semaphore:
            try:
                trade_desc = self.trades[index]  # trade description string
                await external_call(trade_desc)  # call the external settlement API
                return "settled:" + str(index)
            except Exception as e:  # the external call failed
                return "error:" + str(index) + ":" + str(e)
    tasks = []  # coroutines to run
    for index in trade_indices:
        tasks.append(run_one(index))
    results = await asyncio.gather(*tasks)  # run them all concurrently
    return list(results)
共通:
- 都用 asyncio.Semaphore(max_concurrent) 限制並發
- 都用 gather 同時開動全部
- 都接受 external_call function
唔同:
- OrderBook 用 trade INDEX(int),唔係 id
TaskQueue 用 task_id(string)
- OrderBook 要 check 兩邊 order(buyer + seller)
TaskQueue 只 check 一個 task
- OrderBook 有 try/except → "error:index:msg"
TaskQueue L6 冇 try/except
- OrderBook 3 種 return:settled / skipped / error
TaskQueue L6 得 boolean(True / False)
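Matched trades must be sent to an external settlement system. The external side is rate limited, so at most N settlements may be in flight at once. Trades are addressed by index, not by id. If either side's order has been cancelled, skip the trade. If the external call fails, return an error.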
def __init__(self, clock=None):
self.orders = {}
self.timestamp_counter = 0
self.trades = []
self.trade_details = []
self.locks = defaultdict(asyncio.Lock)
self.clock = ...
# L6 冇加新 field,用 L4 嘅 trades / trade_details
"settled:0" → 結算成功
"skipped:1" → index 出界 / 任一方 CANCELLED
"error:2:msg" → external_call 拋 exception
# 假設之前 match_orders 已經產生 3 個 trade:
self.trades = [
"trade: b1 x s1 @ 100 qty 2",
"trade: b2 x s2 @ 101 qty 3",
"trade: b3 x s3 @ 102 qty 1",
]
self.trade_details = [
{"buyer_id": "b1", "seller_id": "s1", ...},
{"buyer_id": "b2", "seller_id": "s2", ...},
{"buyer_id": "b3", "seller_id": "s3", ...},
]
# 跟住 cancel 咗 s2:
ob.cancel_order("s2")
# self.orders["s2"]["status"] = "CANCELLED"
# 而家 settle:
async def fake_call(desc):
if "b3" in desc:
raise RuntimeError("api down")
await ob.settle_trades([0, 1, 2, 99], fake_call, max_concurrent=2)
# Index 0:
# detail = (b1, s1),兩邊都 ACTIVE/FILLED 等等(唔 cancel)
# external_call("trade: b1 x s1 ...") OK
# → "settled:0"
#
# Index 1:
# detail = (b2, s2)
# s2.status == "CANCELLED" → skip
# → "skipped:1"
#
# Index 2:
# detail = (b3, s3),兩邊 OK
# external_call raises RuntimeError("api down")
# → "error:2:api down"
#
# Index 99:
# 99 >= len(trade_details)=3 → 出界
# → "skipped:99"
# results = ["settled:0", "skipped:1", "error:2:api down", "skipped:99"]
external_call 係外部 function(可能 raise 任何 exception)
如果唔 catch:
gather 入面有一個 raise → 整個 gather 失敗
其他成功嘅 result 都收唔到
Catch 住變 "error:N:msg" string:
每個 trade 獨立 report 結果
一個壞,其他繼續
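The difference between letting the exception escape and catching it per item can be seen side by side. A minimal sketch; `flaky` and `guarded` are hypothetical stand-ins for external_call and run_one.

```python
import asyncio


async def flaky(i):
    await asyncio.sleep(0)
    if i == 2:
        raise RuntimeError("api down")
    return "settled:" + str(i)


async def guarded(i):
    try:
        return await flaky(i)
    except Exception as e:  # turn the failure into a per-item result string
        return "error:" + str(i) + ":" + str(e)


async def demo():
    try:
        await asyncio.gather(flaky(0), flaky(1), flaky(2))
        unguarded = "no exception"
    except RuntimeError as e:  # the first exception propagates; other results are lost
        unguarded = "gather raised: " + str(e)
    guarded_results = await asyncio.gather(guarded(0), guarded(1), guarded(2))
    return unguarded, guarded_results


unguarded, guarded_results = asyncio.run(demo())
print(unguarded)        # gather raised: api down
print(guarded_results)  # ['settled:0', 'settled:1', 'error:2:api down']
```

`asyncio.gather(..., return_exceptions=True)` is another way to keep the other results, but it hands back exception objects rather than the "error:N:msg" strings this format needs.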
L5 用 asyncio.Lock per id:
每個 order_id 一個 lock,一次得 1 個
L6 用 asyncio.Semaphore(N):
一個 sem,一次最多 N 個 coroutine 入到去
超過 N 個想入 → 排隊等
# 想像:
# Lock = 1 個 toilet(一次 1 個人)
# Semaphore(3) = 3 個 toilet(一次最多 3 個人)
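To see the cap in action, count how many coroutines are inside the guarded section at the same moment. A small sketch; `demo`, `worker`, and the 10 ms sleep are assumptions for illustration.

```python
import asyncio


async def demo(n_tasks, max_concurrent):
    semaphore = asyncio.Semaphore(max_concurrent)
    running = 0  # coroutines currently inside the semaphore
    peak = 0     # highest value of running ever observed

    async def worker():
        nonlocal running, peak
        async with semaphore:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # pretend to call the external API
            running -= 1

    await asyncio.gather(*(worker() for _ in range(n_tasks)))
    return peak


peak_sem3 = asyncio.run(demo(10, 3))  # Semaphore(3): at most 3 inside at once
peak_sem1 = asyncio.run(demo(10, 1))  # Semaphore(1) behaves like a single lock
print(peak_sem3, peak_sem1)  # 3 1
```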
未整。accumulate pattern、warehouse transfer、computed metric sort。
YouTube 頻道。Channel(topic)有人 subscribe,up主出新片(publish),subscriber 收到通知。每個 subscriber 自己記住睇到第幾條(consumer offset)。
想像 YouTube:
┌─────────────────────────────────────────┐
│ Topic: "cooking" │
│ subscribers: {alice, bob} │
│ messages: [msg_1, msg_2, msg_3] │
├─────────────────────────────────────────┤
│ Topic: "gaming" │
│ subscribers: {bob, carol} │
│ messages: [msg_4] │
└─────────────────────────────────────────┘
每個 (topic, user) 配對有個 offset:
("cooking", "alice") → 2 ← alice 喺 cooking 睇到第 2 條
("cooking", "bob") → 0 ← bob 一條都未睇
("gaming", "bob") → 1 ← bob 喺 gaming 睇晒
後面 level 加多啲嘢:
L1 create_topic / subscribe / publish / get_message
L2 top_topics(按 msg count 排)+ list_subscribers
L3 加 TTL:太舊嘅 message 過期
L4 consume():返回下一條未讀,offset++
L5 batch_publish / batch_consume,lock per topic
L6 push_notifications — 所有 subscriber 都要 push,
用 Semaphore 限並發,try/except 唔 skip 任何人
import time
import asyncio
from collections import defaultdict

class PubSubSystem:
    def __init__(self, ttl_ms=None):
        self.topics = {}  # L1: all topics
        self.offsets = {}  # L1: (topic, user) → offset
        self.msg_counter = 0  # L1: global message number
        self.ttl_ms = ttl_ms  # added in L3: TTL in milliseconds
        self.locks = defaultdict(asyncio.Lock)  # added in L5: one lock per topic
self.topics = {
"cooking": {
"subscribers": {"alice", "bob"},
"messages": [
{"id": "msg_1", "content": "hello", "timestamp": 1000.0},
{"id": "msg_2", "content": "wow", "timestamp": 2000.0},
],
},
"gaming": {
"subscribers": {"bob"},
"messages": [],
},
}
# 第一層 key = topic_id
# 第二層 inner dict 有 subscribers (set) + messages (list)
self.offsets = {
("cooking", "alice"): 2, ← alice 喺 cooking 已讀 2 條
("cooking", "bob"): 0, ← bob 一條都未讀
("gaming", "bob"): 0,
}
# Key = (topic_id, user_id) tuple
# Value = 下一條未讀 message 嘅 index
self.topics = {} # 上面個 topic dict
self.offsets = {} # 上面個 offset dict
self.msg_counter = 0 # 全局 msg counter
# L3 加 self.ttl_ms(過期時間)
# L5 加 self.locks(async lock)
Multi-collection pattern。Topic 嘅 messages 屬於 topic 本身,但 offset 屬於 (topic, user) 配對。同一個 user subscribe 多個 topic,每個 topic 嘅進度唔同,所以 offset 要 keyed by tuple。
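A short sketch of why the offset has to be keyed by the (topic, user) pair. The `publish`, `subscribe`, and `consume` functions here are simplified, hypothetical versions (messages are plain strings, no TTL, and consume is only described later as an L4 method), so treat the details as assumptions.

```python
topics = {
    "cooking": {"subscribers": set(), "messages": ["c1", "c2"]},
    "gaming": {"subscribers": set(), "messages": []},
}
offsets = {}  # (topic_id, user_id) -> index of the next unread message


def publish(topic_id, content):
    topics[topic_id]["messages"].append(content)


def subscribe(topic_id, user_id):
    topics[topic_id]["subscribers"].add(user_id)
    # start at the current length, so a new subscriber skips older messages
    offsets[(topic_id, user_id)] = len(topics[topic_id]["messages"])


def consume(topic_id, user_id):
    key = (topic_id, user_id)
    if key not in offsets:
        return None
    messages = topics[topic_id]["messages"]
    if offsets[key] >= len(messages):  # nothing unread
        return None
    msg = messages[offsets[key]]
    offsets[key] += 1  # advance only this (topic, user) pair
    return msg


subscribe("cooking", "bob")  # offset starts at 2
subscribe("gaming", "bob")   # offset starts at 0
publish("cooking", "c3")
publish("gaming", "g1")
first_cooking = consume("cooking", "bob")  # "c3": the older c1 / c2 are skipped
first_gaming = consume("gaming", "bob")    # "g1"
```

After the two calls bob's progress is 3 in cooking and 1 in gaming, which a single per-user counter could not represent.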
def _now_ms(self):  # current time in milliseconds
    return time.time() * 1000

def _purge_expired(self, topic_id):  # used from L3: drop this topic's expired messages
    if self.ttl_ms is None:  # no TTL configured: nothing to purge
        return
    if topic_id not in self.topics:  # unknown topic: nothing to do
        return
    now = self._now_ms()
    valid = []  # messages that have not expired
    for msg in self.topics[topic_id]["messages"]:
        age = now - msg["timestamp"]  # how old the message is
        if age <= self.ttl_ms:  # still valid: keep it
            valid.append(msg)
    self.topics[topic_id]["messages"] = valid  # replace the old list with the new one
time.time() 返回秒(float),乘 1000 變毫秒
1700000000.5 → 1700000000500.0
Lazy = 唔係定時清,係讀之前先清。任何 read operation(get_message / top_topics / consume)入面第一句就 call 呢個 helper。
例:ttl_ms = 5000(5 秒過期)
now = 10000
messages = [
{"id": "msg_1", "timestamp": 3000, ...}, ← age=7000 → 過期
{"id": "msg_2", "timestamp": 6000, ...}, ← age=4000 → OK
{"id": "msg_3", "timestamp": 9000, ...}, ← age=1000 → OK
]
purge 後 → 淨返 [msg_2, msg_3]
注意:list 縮短咗,offset 可能 point 到過晒嘅位
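The offset problem mentioned above is easy to reproduce with the same numbers. The adjustment at the end is only one possible remedy, shown as an assumption; it relies on purged messages always being a prefix of the list (oldest first), and it is not taken from the notes.

```python
messages = [
    {"id": "msg_1", "timestamp": 3000},
    {"id": "msg_2", "timestamp": 6000},
    {"id": "msg_3", "timestamp": 9000},
]
offset = 2  # the subscriber has read msg_1 and msg_2; next unread is index 2 (msg_3)
now, ttl_ms = 10000, 5000

# Same rule as _purge_expired: keep a message while age <= ttl_ms
valid = [m for m in messages if now - m["timestamp"] <= ttl_ms]
purged_count = len(messages) - len(valid)

# Using the old offset on the shortened list silently skips msg_3
stale_view = [m["id"] for m in valid[offset:]]

# Hypothetical fix: shift the offset back by the number of purged messages
adjusted = max(0, offset - purged_count)
fixed_view = [m["id"] for m in valid[adjusted:]]

print(stale_view)  # []
print(fixed_view)  # ['msg_3']
```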
topic = a channel
subscribe = follow a channel
publish = post a new message
msg_id = the message's number ("msg_1")
def create_topic(self, topic_id):  # open a new channel
    if topic_id in self.topics:  # already exists
        return False
    self.topics[topic_id] = {  # new topic: no subscribers, no messages
        "subscribers": set(),
        "messages": [],
    }
    return True  # True means the topic was actually created

def subscribe(self, topic_id, user_id):  # a user subscribes to a topic
    if topic_id not in self.topics:  # unknown topic
        return False
    if user_id in self.topics[topic_id]["subscribers"]:  # already subscribed
        return False
    self.topics[topic_id]["subscribers"].add(user_id)
    # offset starts at the current message count, so a new subscriber gets no old messages
    self.offsets[(topic_id, user_id)] = len(self.topics[topic_id]["messages"])
    return True  # True means the subscription was actually added

def publish(self, topic_id, message):  # post a new message to a topic
    if topic_id not in self.topics:  # unknown topic
        return None
    self.msg_counter += 1  # global counter
    msg_id = "msg_" + str(self.msg_counter)  # e.g. "msg_1"
    msg_obj = {
        "id": msg_id,  # message number
        "content": message,  # payload
        "timestamp": self._now_ms(),  # used from L3: publish time
    }
    self.topics[topic_id]["messages"].append(msg_obj)
    return msg_id  # the new message's id

def get_message(self, topic_id, msg_id):  # look up message content by msg_id
    self._purge_expired(topic_id)  # used from L3: purge expired messages first
    if topic_id not in self.topics:  # unknown topic
        return None
    for msg in self.topics[topic_id]["messages"]:
        if msg["id"] == msg_id:  # found the target
            return msg["content"]
    return None  # not found
def __init__(self):
self.topics = {}
self.offsets = {}
self.msg_counter = 0
self.topics = { 頻道目錄(topic_id → info dict)
"cooking": {
"subscribers": {"alice"}, 訂閱者 set
"messages": [ message list(按 publish 順序)
{"id": "msg_1", "content": "hello",
"timestamp": 1700000000000.0},
],
},
}
self.offsets = { 讀進度((topic, user) → index)
("cooking", "alice"): 1, alice 讀到第 1 條
}
self.msg_counter = 1 全局 msg 編號
self.ttl_ms = None message 過期時間(L3 先加)
self.locks = defaultdict(asyncio.Lock) per-topic 鎖(L5 先加)
If the topic already exists, return False. Otherwise add it to self.topics with an empty subscribers set and an empty messages list.
Suppose cooking already has 5 messages
alice subscribes:
offset = len(messages) = 5
i.e. alice's offset starts at index 5
→ alice never receives the earlier msg_1..msg_5
→ the next published message, msg_6, is alice's first unread one
Why? YouTube logic: if you subscribe today,
you do not suddenly get notifications for videos the uploader posted 3 years ago
Key point: msg_counter is global, not per-topic
publish("cooking", "x") → "msg_1"
publish("gaming", "y") → "msg_2" ← not msg_1
publish("cooking", "z") → "msg_3"
msg_counter counts across all topics together
format: "msg_" + str(counter)
get_message("cooking", "msg_1") → "hello"
get_message("cooking", "msg_999") → None
get_message("unknown_topic", "msg_1") → None
# Linear scan 個 messages list 搵相同 id
# L3 加:scan 之前要 _purge_expired
top_topics = 邊個 topic 最多 message list_subscribers = 邊啲人 subscribe 咗 tie-break = 同分點排
def top_topics(self, n): # 返回 message 最多嘅 N 個 topic
for tid in self.topics: # L3 用:先清所有 topic 嘅過期
self._purge_expired(tid) # 清過期 msg
counts = [] # 收集 (topic, count) pairs
for tid in self.topics: # 逐個 topic 數 message 數量
count = len(self.topics[tid]["messages"]) # 數呢個 topic 有幾多 msg
counts.append((tid, count)) # 記住 topic + 數量
# 排序:count 大嘅排先;同 count 嗰陣,topic_id 字母細嘅排先
for i in range(len(counts)): # outer loop of a simple exchange sort: position i is compared with every later j
for j in range(i + 1, len(counts)): # 同每個後面嘅比
swap = False # 預設唔換
if counts[j][1] > counts[i][1]: # count 大 → swap
swap = True # count 大嘅排前
elif counts[j][1] == counts[i][1]: # 同 count
if counts[j][0] < counts[i][0]: # 同 count → 字母細嘅排先
swap = True # 字母細排前
if swap: # 要唔要換位
counts[i], counts[j] = counts[j], counts[i] # 交換位置
result = [] # 收集格式化結果
for k in range(min(n, len(counts))): # 攞頭 N 個
tid = counts[k][0] # topic id
count = counts[k][1] # message 數量
result.append(tid + "(" + str(count) + ")") # format "cooking(3)"
return result # 返回排好嘅 list
def list_subscribers(self, topic_id): # 列出某 topic 嘅所有 subscriber
if topic_id not in self.topics: # Topic 唔存在 → 空 list
return [] # 冇就返空 list
subs = list(self.topics[topic_id]["subscribers"]) # set → list
subs.sort() # 字母排序
return subs # 返回排好序嘅 subscribers
def __init__(self):
self.topics = {}
self.offsets = {}
self.msg_counter = 0
self.topics = { 頻道目錄(同 L1)
"cooking": {
"subscribers": {"alice", "bob"}, 訂閱者
"messages": [
{"id": "msg_1", "content": "hello", "timestamp": 1000.0},
{"id": "msg_2", "content": "wow", "timestamp": 2000.0},
],
},
}
self.offsets = { 讀進度
("cooking", "alice"): 1,
("cooking", "bob"): 0,
}
self.msg_counter = 2 全局 msg 編號
self.ttl_ms = None message 過期時間(L3 先加)
self.locks = defaultdict(asyncio.Lock) per-topic 鎖(L5 先加)
Primary key: message count DESCENDING
Tie-break key: topic_id ASCENDING (alphabetical)
Example:
"cooking" → 3 messages
"gaming" → 3 messages
"music" → 5 messages
"art" → 0 messages
top_topics(10) result:
"music(5)" ← highest count
"cooking(3)" ← tie, "c" < "g"
"gaming(3)"
"art(0)"
Format: topic_id + "(" + str(count) + ")"
"cooking" + "(" + "3" + ")" → "cooking(3)"
Note: not "cooking: 3"; the number is wrapped in parentheses
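The same ordering rule (count descending, then topic_id ascending) can be written with a tuple sort key. This is a minimal sketch: the standalone `top_topics(counts, n)` signature taking a plain dict is an assumption for illustration, not the class method.

```python
def top_topics(counts, n):
    """counts maps topic_id -> message count; returns strings like 'cooking(3)'."""
    # Negating the count sorts it descending while topic_id stays ascending.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{tid}({count})" for tid, count in ranked[:n]]


counts = {"cooking": 3, "gaming": 3, "music": 5, "art": 0}
top_all = top_topics(counts, 10)
top_two = top_topics(counts, 2)
```

The tuple key should give the same order as the hand-written swap loop, in far fewer lines.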
# subscribers 係 set,要轉做 list 先 sort
# set 冇順序,list 先有
list_subscribers("cooking") → ["alice", "bob", "carol"]
list_subscribers("empty_topic") → []
list_subscribers("unknown") → [] ← 唔 raise,return []
TTL = time-to-live,過咗呢個時間 message 就過期 lazy = 唔係定時清,係 read 前先清
# __init__ 改:accept ttl_ms 參數
def __init__(self, ttl_ms=None):
self.topics = {} # 所有 topic
self.offsets = {} # (topic, user) → offset
self.msg_counter = 0 # 全局 msg 編號
self.ttl_ms = ttl_ms # L3 新加:None = 永不過期
# publish 改:每條 msg 記 timestamp(其實 L1 已經寫咗 self._now_ms())
msg_obj = { # 砌 message dict
"id": msg_id, # message 編號
"content": message, # 內容
"timestamp": self._now_ms(), # L3 用:出片嘅毫秒 timestamp
}
# _purge_expired helper(read 之前 call)
def _purge_expired(self, topic_id): # 清走過期 message(lazy);唔係背景工人清,係有人讀 topic 前先順手掃
if self.ttl_ms is None: # 冇 TTL → skip
return # 唔使清
if topic_id not in self.topics: # Topic 唔存在?
return # Topic 唔存在直接走
now = self._now_ms() # 攞而家時間
valid = [] # 收集未過期嘅 msg
for msg in self.topics[topic_id]["messages"]: # 逐條 message 睇
age = now - msg["timestamp"] # 計算呢條 msg 幾舊
if age <= self.ttl_ms: # 仲後生 → 留
valid.append(msg) # 呢條仲有效,留低
self.topics[topic_id]["messages"] = valid # 用新 list 取代舊嘅
# 所有 read function 開頭都加:
# - get_message: self._purge_expired(topic_id)
# - top_topics: for tid in self.topics: self._purge_expired(tid)
# - consume: self._purge_expired(topic_id)
def __init__(self, ttl_ms=None):
self.topics = {}
self.offsets = {}
self.msg_counter = 0
self.ttl_ms = ttl_ms ← L3 加
self.topics = { 頻道目錄
"cooking": {
"subscribers": {"alice"}, 訂閱者
"messages": [
{
"id": "msg_1",
"content": "hello",
"timestamp": 1700000000000.0, 出片時間(L3 purge 用)
},
],
},
}
self.offsets = { 讀進度
("cooking", "alice"): 0,
}
self.msg_counter = 1 全局 msg 編號
self.ttl_ms = 5000 message 過期時間(L3 加)
self.locks = defaultdict(asyncio.Lock) per-topic 鎖(L5 先加)
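1. On publish: record the timestamp, do nothing else
2. On read (get_message / top_topics / consume):
first scan the whole messages list
remove entries with age > ttl_ms from the list
then perform the actual read
Why lazy?
- No timer / background thread needed
- If nobody reads, no work is done, which saves CPU
- Side effect: the length of the messages list changes
After a purge the messages list is shorter
An offset that was 5 may now face a list of only 3 messages
→ offset >= len(messages) → consume returns None
The L4 consume has to handle this case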
consume = 攞下一條未讀 offset = 已讀到第幾條(index) 每 (topic, user) 配對有自己嘅 offset
def consume(self, topic_id, user_id): # 返下一條未讀 msg,offset++
self._purge_expired(topic_id) # L3 用:先清過期
if topic_id not in self.topics: # Topic 唔存在 → None
return None # 搵唔到
if user_id not in self.topics[topic_id]["subscribers"]: # 未訂閱 → None
return None # 搵唔到
key = (topic_id, user_id) # 組成 tuple key
if key not in self.offsets: # 應該唔會發生,保險
return None # 搵唔到
messages = self.topics[topic_id]["messages"] # 攞 message list
offset = self.offsets[key] # 攞已讀位置
if offset >= len(messages): # 已讀晒(或 purge 之後超出範圍)
return None # 搵唔到
msg = messages[offset] # 攞下一條未讀
self.offsets[key] = offset + 1 # offset++(已讀)
return msg["content"] # 返回內容
def __init__(self, ttl_ms=None):
self.topics = {}
self.offsets = {}
self.msg_counter = 0
self.ttl_ms = ttl_ms
self.topics = { 頻道目錄
"cooking": {
"subscribers": {"alice", "bob"}, 訂閱者
"messages": [
{"id": "msg_1", "content": "v1", "timestamp": 1000.0},
{"id": "msg_2", "content": "v2", "timestamp": 2000.0},
{"id": "msg_3", "content": "v3", "timestamp": 3000.0},
],
},
}
self.offsets = { 讀進度
("cooking", "alice"): 1, alice 讀完 msg_1,下次攞 msg_2
("cooking", "bob"): 0, bob 一條都未讀
}
self.msg_counter = 3 全局 msg 編號
self.ttl_ms = 5000 message 過期時間
self.locks = defaultdict(asyncio.Lock) per-topic 鎖(L5 先加)
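offset = "index in the list of the next unread message"
offset = 0 → next is messages[0]
offset = 3 → next is messages[3]
offset == len(messages) → everything has been read
It is not "number of messages read" (the two numbers agree only while nothing has been purged)
It is "which index to read from next"
offset = 5 (5 messages already read)
After a purge only 3 messages remain (the first 4 expired)
offset 5 >= len 3 → return None
So the user misses the expired messages, and because the stored offset is not shifted when the list shrinks, consume also returns None for the surviving unread ones until the list grows past index 5
This is the trade-off of lazy purge with index-based offsets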
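If a spec did require surviving unread messages to stay reachable, one possible variation is to shift the offsets whenever a purge removes messages. The sketch below is not what the notes implement; `purge_and_rebase` is a hypothetical helper, and it assumes messages are appended in timestamp order so the expired ones always form a prefix of the list.

```python
def purge_and_rebase(messages, offsets, topic_id, now_ms, ttl_ms):
    """Drop expired messages and shift this topic's offsets by the number dropped."""
    kept = [m for m in messages if now_ms - m["timestamp"] <= ttl_ms]
    dropped = len(messages) - len(kept)
    for (tid, user_id), offset in list(offsets.items()):
        if tid == topic_id:
            # Never go below 0: a reader who was behind restarts at the oldest survivor.
            offsets[(tid, user_id)] = max(0, offset - dropped)
    return kept


# Seven messages with timestamps 1000..7000 (illustrative values).
messages = [{"id": f"msg_{i}", "timestamp": i * 1000} for i in range(1, 8)]
offsets = {
    ("cooking", "alice"): 5,  # next unread was messages[5] == msg_6
    ("cooking", "bob"): 1,
    ("gaming", "bob"): 2,     # other topic, must stay untouched
}
kept = purge_and_rebase(messages, offsets, "cooking", now_ms=9500, ttl_ms=5000)
alice_next = kept[offsets[("cooking", "alice")]]["id"]
```

With the first 4 messages expired, alice's offset moves from 5 to 1 and still points at msg_6. Whether this behaviour is wanted depends on the spec, so treat it as an option to check against the tests rather than a fix.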
subscribe("cooking", "alice"):
offsets[("cooking", "alice")] = len(messages)
即係:alice 由 subscribe 嗰一刻之後嘅 msg 開始收
Subscribe 之前嘅 msg 永遠 consume 唔到
batch_publish / batch_consume 並發跑 每個 topic 一把 lock,防止同 topic 撞
# __init__ 加:
self.locks = defaultdict(asyncio.Lock) # 每個 topic_id 自動有一把 lock
async def batch_publish(self, operations): # operations = [(topic, msg), ...]
async def _single_publish(topic_id, message): # 發佈一條 message
async with self.locks[topic_id]: # 攞呢個 topic 嘅鎖
return self.publish(topic_id, message) # 真正 publish
tasks = [] # 收集 coroutine
for topic_id, message in operations: # 砌一 list 嘅 coroutine
task = _single_publish(topic_id, message) # 砌 publish coroutine
tasks.append(task) # 將呢項塞入 list,留待之後一齊處理或回傳
results = await asyncio.gather(*tasks) # 全部並發等齊
return list(results) # 返回全部結果
async def batch_consume(self, operations): # operations = [(topic, user), ...]
async def _single_consume(topic_id, user_id): # 消費一條 message
async with self.locks[topic_id]: # 鎖住 topic
return self.consume(topic_id, user_id) # 真正 consume
tasks = [] # 收集 coroutine
for topic_id, user_id in operations: # 逐個 consume op
task = _single_consume(topic_id, user_id) # 砌 consume coroutine
tasks.append(task) # 將呢項塞入 list,留待之後一齊處理或回傳
results = await asyncio.gather(*tasks) # 並發跑全部
return list(results) # 返回全部結果
def __init__(self, ttl_ms=None):
self.topics = {}
self.offsets = {}
self.msg_counter = 0
self.ttl_ms = ttl_ms
self.locks = defaultdict(asyncio.Lock) ← L5 加
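First access to self.locks["cooking"]:
key missing → asyncio.Lock() is created automatically → stored in the dict
Later accesses to self.locks["cooking"]:
return that same lock
So every topic_id gets its own lock
Locks of different topics never block each other
Two ops on the same topic have to queue
One global lock: every publish has to queue → slow
Per-topic lock:
batch_publish([("a","x"), ("b","y"), ("a","z")])
"a","x" and "a","z" queue (same topic)
"b","y" runs alongside them (different topic)
→ different topics do not wait on each other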
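The queueing behaviour described above can be observed with a small script. This is a sketch under assumptions: `publish` here is a stand-in coroutine that sleeps briefly inside the lock (the real publish is synchronous), and the `log` list exists only to record the order of events.

```python
import asyncio
from collections import defaultdict


async def demo():
    locks = defaultdict(asyncio.Lock)  # one lock per topic_id, created on first access
    log = []

    async def publish(topic_id, message):
        async with locks[topic_id]:
            log.append(("start", topic_id, message))
            await asyncio.sleep(0.01)  # stand-in for real work done while holding the lock
            log.append(("end", topic_id, message))
            return f"{topic_id}:{message}"

    ops = [("a", "x"), ("b", "y"), ("a", "z")]
    results = await asyncio.gather(*(publish(t, m) for t, m in ops))
    return list(results), log, locks


results, log, locks = asyncio.run(demo())
same_lock_reused = locks["a"] is locks["a"]
```

In the recorded log, ("a", "z") starts only after ("a", "x") has ended, while ("b", "y") starts before ("a", "x") ends. asyncio.gather returns results in the order the coroutines were passed in, regardless of completion order.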
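Call push_func once for every subscriber. max_concurrent caps concurrency through a Semaphore. Each call is wrapped in try/except, so no subscriber is skipped.
async def push_notifications(self, topic_id, message, push_func, max_concurrent=3): # concurrent push; every subscriber is attempted (all-sleep style), only an unknown topic returns early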
if topic_id not in self.topics: # Topic 唔存在 → 空 list
return [] # 冇就返空 list
# Snapshot:記住而家呢一刻嘅 subscribers(之後退訂都唔影響)
subscribers = list(self.topics[topic_id]["subscribers"]) # Snapshot 而家嘅 subscriber list
semaphore = asyncio.Semaphore(max_concurrent) # 最多 N 個同時飛
async def _push_one(user_id): # 推送畀一個 user
async with semaphore: # 排隊攞 semaphore
try: # 嘗試 push
result = await push_func(user_id, message) # 真係 call push
return { # 返回成功 dict
"user_id": user_id, # 記低邊個 user
"status": "success", # push 成功
"result": result, # push function 嘅返回值
}
except Exception as e: # 失敗都要記低,唔 skip
return { # 返回錯誤 dict
"user_id": user_id, # 記低邊個 user
"status": "error", # push 失敗
"result": str(e), # 錯誤訊息
}
tasks = [] # 收集 coroutine
for uid in subscribers: # 每個 subscriber 一個 coroutine
tasks.append(_push_one(uid)) # 加一個 push 任務
results = await asyncio.gather(*tasks) # 全部跑齊先 return
return list(results) # 返回全部結果
def __init__(self, ttl_ms=None):
self.topics = {}
self.offsets = {}
self.msg_counter = 0
self.ttl_ms = ttl_ms
self.locks = defaultdict(asyncio.Lock)
asyncio.Semaphore(3):counter 由 3 開始
async with semaphore:
入嗰陣 counter -= 1(如果 0 就等)
出嗰陣 counter += 1(喚醒等緊嘅)
例:5 個 subscriber,max_concurrent=3
coroutine 1,2,3 拎到 semaphore → 同時 push
coroutine 4,5 等
1 完成放 → 4 拎到 → push
2 完成放 → 5 拎到 → push
→ 同時最多 3 個 push 飛緊
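A failed push to one subscriber ≠ skipping the others
Every subscriber is attempted:
- success: {"status": "success", "result": <return value>}
- failure: {"status": "error", "result": <error message>}
try/except wraps push_func, so the exception is swallowed
and turned into a dict in the result list
That is why asyncio.gather does not raise for ordinary exceptions; it returns after all pushes finish
subscribers = list(self.topics[topic_id]["subscribers"])
list() takes a snapshot (copy)
Why?
- push_func may run for a long time (network call)
- someone may unsubscribe while it is running
- iterating the live set directly could miss a push or raise an error
- after the snapshot, the subscriber list of that moment is fixed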
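To see the Semaphore cap and the error-to-dict conversion together, here is a small self-contained sketch. `push_all`, `fake_push`, and the `peak` counter are hypothetical names added for illustration; `fake_push` takes only a user_id, unlike the push_func in the notes.

```python
import asyncio


async def push_all(subscribers, push_func, max_concurrent=3):
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0  # pushes currently inside the semaphore
    peak = 0       # highest in_flight value observed

    async def push_one(user_id):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                result = await push_func(user_id)
                return {"user_id": user_id, "status": "success", "result": result}
            except Exception as exc:  # record the failure instead of propagating it
                return {"user_id": user_id, "status": "error", "result": str(exc)}
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(push_one(u) for u in subscribers))
    return list(results), peak


async def fake_push(user_id):  # hypothetical push function for the demo
    await asyncio.sleep(0.01)
    if user_id == "carol":
        raise RuntimeError("device offline")
    return "sent to " + user_id


subscribers = sorted({"alice", "bob", "carol", "dave", "erin"})  # snapshot as a list
results, peak = asyncio.run(push_all(subscribers, fake_push))
statuses = {r["user_id"]: r["status"] for r in results}
```

All five subscribers get a result entry even though carol's push raised, and no more than three pushes run inside the semaphore at any moment.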
WhatsApp 群組 chat。User join channel、send message。Message 可以 reply 開 thread。後面加 message TTL、inactive user auto-remove、channel merge、async batch。
想像個 WhatsApp 系統:
┌─────────────────────────────────────────┐
│ Channel: #lunch │
│ Users: {alice, bob, charlie} │
│ Messages: │
│ msg_1 alice "食咩好?" │
│ msg_2 bob "茶記?" ← reply ┐ │
│ msg_3 charlie └→ "我都想" │ │
│ msg_4 alice "OK 12 點" │
└─────────────────────────────────────────┘
同 Bank 嘅最大分別:
Bank 得 1 個 collection:self.accounts
Chat 有 3 個 collection:
self.channels ← 邊個 channel 有邊啲 users + message_ids
self.messages ← msg_id 對應一條 message 嘅 detail
self.user_activity ← (channel, user) 最後活動時間
Multi-collection 意思:
一個動作可能要更新多個 collection。
例如 send_message:
1. self.messages[msg_id] = {...} ← 新 message 入庫
2. channel["message_ids"].append ← channel 記住 msg_id
3. self.user_activity[key] = now ← 用戶活動更新
Level 加嘢順序:
L1 CRUD:
create_channel / join_channel / send_message / get_message
send_message return msg_id(format: "msg_{counter}")
L2 Sort / Filter / Search:
list_channels — 全部 channel 名,字母順
top_channels(n) — 按 message count 排,format "channel(count)"
search_messages — keyword 搜尋(case-insensitive)
L3 Time-based(TTL):
__init__(message_ttl_ms, inactive_ttl_ms)
過期 message 唔再出現(lazy expiry)
太耐冇發言嘅 user 自動踢走
L4 Thread + Merge:
reply_to — 開 thread,reply 都係獨立 message
get_thread — 攞 parent + replies
merge_channels — source 搬去 target
L5 Async batch:
batch_send / batch_search
Lock per channel_id
L6 Sync messages:
sync_messages 派去外部
Fail-fast:channel 冇 message → 即 return []
import time
import asyncio
from collections import defaultdict
class ChatSystem:
def __init__(self, message_ttl_ms=None, inactive_ttl_ms=None):
self.channels = {} # L1:每個 channel 嘅 users + message_ids
self.messages = {} # L1:msg_id → message dict
self.user_activity = {} # written from L1 (join / send), read by the L3 purge: (channel_id, user_id) → last activity in ms
self.msg_counter = 0 # L1:全局 message 編號
self.message_ttl_ms = message_ttl_ms # L3 加:message 過期時間
self.inactive_ttl_ms = inactive_ttl_ms # L3 加:user 唔活躍時間
self.locks = defaultdict(asyncio.Lock) # L5 加:per-channel lock
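Bank: a single self.accounts dict
Chat: 3 separate dicts + 1 counter
Why 3 dicts?
Because channels and messages are one-to-many, and messages also reference each other:
one channel has many messages
each message belongs to exactly one channel
a reply additionally points to its parent message
Messages are stored on their own, keyed by msg_id,
and a channel only keeps a list of msg_ids,
so fetching a message by id is a direct dict lookup instead of a loop inside the channel.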
self.channels = {
"#lunch": {
"users": {"alice", "bob"},
"message_ids": ["msg_1", "msg_2"],
},
}
self.messages = {
"msg_1": {
"msg_id": "msg_1",
"channel_id": "#lunch",
"user_id": "alice",
"content": "食咩好?",
"timestamp": 1700000000000,
"replies": [],
},
}
self.user_activity = {
("#lunch", "alice"): 1700000000000,
("#lunch", "bob"): 1700000005000,
}
self.channels = {}
self.messages = {}
self.user_activity = {} # L1 join_channel 已經會用到
self.msg_counter = 0
# L3 先用到:
# self.message_ttl_ms
# self.inactive_ttl_ms
# L5 先用到:
# self.locks
def _now_ms(self):
    # Current time in milliseconds (float).
    return time.time() * 1000

def _next_msg_id(self):
    # Hand out the next id: "msg_1", "msg_2", ...
    self.msg_counter = self.msg_counter + 1
    return "msg_" + str(self.msg_counter)

def _purge_expired_messages(self, channel_id):
    # L3 helper: lazily remove expired messages from one channel.
    if self.message_ttl_ms is None:      # TTL disabled, nothing to do
        return
    if channel_id not in self.channels:  # unknown channel
        return
    now = self._now_ms()
    valid_ids = []                       # ids of messages that are still alive
    for mid in self.channels[channel_id]["message_ids"]:
        if mid not in self.messages:     # dangling id, drop it from the index
            continue
        msg = self.messages[mid]
        age = now - msg["timestamp"]     # message age in ms
        if age <= self.message_ttl_ms:
            valid_ids.append(mid)        # not expired, keep
        else:
            del self.messages[mid]       # expired, remove the message body too
    self.channels[channel_id]["message_ids"] = valid_ids

def _purge_inactive_users(self, channel_id):
    # L3 helper: lazily remove users who have been idle too long.
    if self.inactive_ttl_ms is None:     # feature disabled
        return
    if channel_id not in self.channels:  # unknown channel
        return
    now = self._now_ms()
    to_remove = []                       # collect first; the set must not change while it is iterated
    for uid in self.channels[channel_id]["users"]:
        key = (channel_id, uid)
        if key in self.user_activity:
            idle = now - self.user_activity[key]  # ms since the last activity
            if idle > self.inactive_ttl_ms:
                to_remove.append(uid)
    for uid in to_remove:
        self.channels[channel_id]["users"].discard(uid)
        key = (channel_id, uid)
        if key in self.user_activity:
            del self.user_activity[key]
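time.time() → 1700000000.123 (seconds)
time.time()*1000 → ≈1700000000123.0 (milliseconds, still a float)
# TTLs are in milliseconds, so every timestamp is kept in milliseconds
self.msg_counter = 0
_next_msg_id() → "msg_1" counter=1
_next_msg_id() → "msg_2" counter=2
_next_msg_id() → "msg_3" counter=3
# One global counter shared by all channels
# Replies draw from the same counter; there is no separate numbering
Walk through the channel's messages and delete any with age > TTL. Lazy means no timer runs in the background; the helper is called only when someone queries (for example get_message, search_messages, top_channels). With no TTL (None) it does nothing.
Walk through the channel's users, look at each one's last activity in this channel, and remove anyone with idle > TTL. The spec requires calling it once at the start of send_message.
Example:
inactive_ttl_ms = 1000
user_activity[("#lunch", "alice")] = 0
user_activity[("#lunch", "bob")] = 1500
now = 2000
alice idle = 2000 - 0 = 2000 > 1000 → removed
bob idle = 2000 - 1500 = 500 <= 1000 → kept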
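The alice / bob example can be checked with a standalone sketch. `purge_inactive` is a hypothetical free function that takes the clock as `now_ms` (an assumption made for determinism); it is meant to mirror the idle > TTL rule, not to replace the class helper.

```python
def purge_inactive(users, user_activity, channel_id, now_ms, inactive_ttl_ms):
    """Remove users whose idle time exceeds inactive_ttl_ms; return who was removed."""
    if inactive_ttl_ms is None:  # feature disabled
        return set()
    removed = set()
    for uid in users:
        last = user_activity.get((channel_id, uid))
        if last is not None and now_ms - last > inactive_ttl_ms:
            removed.add(uid)
    # Mutate only after the scan, so the set is never changed while iterated.
    for uid in removed:
        users.discard(uid)
        user_activity.pop((channel_id, uid), None)
    return removed


users = {"alice", "bob"}
user_activity = {("#lunch", "alice"): 0, ("#lunch", "bob"): 1500}
removed = purge_inactive(users, user_activity, "#lunch", now_ms=2000, inactive_ttl_ms=1000)
```

alice (idle 2000) is removed together with her activity entry, while bob (idle 500) stays.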
channel = 群組 join = 入群 send_message = 發訊息 msg_id = 條 message 嘅編號
def create_channel(self, channel_id): # 開個新群組
if channel_id in self.channels: # 已經存在 → return False
return False # 失敗就返 False;caller 可以當今次要求冇落地
self.channels[channel_id] = { # 開一個新 entry
"users": set(), # 用 set 自動去重
"message_ids": [], # 用 list 保留發訊次序
}
return True # 成功就返 True;caller 可以當今次動作真係做咗
def join_channel(self, channel_id, user_id): # 用戶入群
if channel_id not in self.channels: # 如果張 channel 唔存在,就 return False
return False # 失敗就返 False;caller 可以當今次要求冇落地
if user_id in self.channels[channel_id]["users"]: # 已經 join 過 → False
return False # 失敗就返 False;caller 可以當今次要求冇落地
self.channels[channel_id]["users"].add(user_id) # 將新住客加落呢個群組名單;之後先有資格 send_message
self.user_activity[(channel_id, user_id)] = self._now_ms() # 記 join 時間做初始活動
return True # 成功就返 True;caller 可以當今次動作真係做咗
def send_message(self, channel_id, user_id, content): # 發 message
self._purge_inactive_users(channel_id) # 第一步:先將太耐冇講嘢嘅人踢出群,免得幽靈住戶繼續發言
if channel_id not in self.channels: # 群組都唔存在,就好似連店舖都未開門
return None # caller 收到 None 就知連 msg_id 都冇派到,條訊息根本冇落地
if user_id not in self.channels[channel_id]["users"]: # 發訊息嗰個人唔喺群入面
return None # 唔係場內住客就唔可以落單;直接話呢張 message 無效
# 第二步:派新 msg_id,同時記低發送時間
msg_id = self._next_msg_id() # 幫呢條新 message 派一張獨一無二嘅票尾
now = self._now_ms() # 記低真正發言時間;之後排 thread / TTL 都會靠佢
# 第三步:同時更新 messages、channel 索引同 user_activity 三個 collection
self.messages[msg_id] = { # 主帳本:條 message 本體放喺呢度
"msg_id": msg_id, # 票尾號碼;之後 caller 拎住佢再查返內容
"channel_id": channel_id, # 呢條 message 屬於邊個群組
"user_id": user_id, # 邊個用戶講嘅
"content": content, # 真正訊息內容
"timestamp": now, # 用嚟做排序、過期判斷同 thread 時序
"replies": [], # 預留一格俾之後嘅 reply;等於先開定回覆清單
}
self.channels[channel_id]["message_ids"].append(msg_id) # 群組時間線追加一個新 msg_id;等於將票尾掛上告示板
self.user_activity[(channel_id, user_id)] = now # 更新用戶最後活躍時間,證明佢啱啱仲喺場
return msg_id # 成功就交返新票尾;caller 之後可以靠佢 get_message / reply
def get_message(self, msg_id): # 攞某條 message
if msg_id not in self.messages: # 主帳本冇呢張票尾,就代表條訊息本身不存在
return None # caller 收到 None,就知查唔返任何內容
msg = self.messages[msg_id] # 先摸到條 message 本體;等陣要用佢個 channel 去做 lazy purge
self._purge_expired_messages(msg["channel_id"]) # 第一步:先清呢個群已過期嘅舊訊息,避免交出一張其實應該失效嘅票
if msg_id not in self.messages: # purge 後可能發現佢其實已經過咗期
return None # 咁就當條訊息已經蒸發;caller 唔應該再見到佢
# 第二步:回傳一份乾淨快照,等 caller 讀內容但唔會直接摸到內部 dict
return { # 回傳係 message 快照,唔係內部原件 reference
"msg_id": msg["msg_id"], # 返返同一張票尾,方便 caller 對得返係邊條訊息
"channel_id": msg["channel_id"], # 告訴 caller 呢條訊息原本屬於邊個群
"user_id": msg["user_id"], # 邊個講嘅
"content": msg["content"], # 真正訊息內容
"timestamp": msg["timestamp"], # 發送時間;之後做排序/比對都靠呢個
}
Bank L1:得 1 個 collection
create_account → 加去 self.accounts
deposit → 改 self.accounts[id]["balance"]
Chat L1:3 個 collection 一齊改
create_channel → 加去 self.channels
join_channel → 改 self.channels[id]["users"]
+ 加去 self.user_activity
send_message → 加去 self.messages
+ 改 self.channels[id]["message_ids"]
+ 更新 self.user_activity
即係話一個 method call 可能要 touch 2-3 個 dict。
唔好漏,唔好次序錯。
def __init__(self):
self.channels = {}
self.messages = {}
self.user_activity = {}
self.msg_counter = 0
self.channels = {
"#lunch": {
"users": {"alice", "bob"},
"message_ids": ["msg_1", "msg_2"],
},
}
self.messages = {
"msg_1": {
"msg_id": "msg_1",
"channel_id": "#lunch",
"user_id": "alice",
"content": "食咩好?",
"timestamp": 1700000000000,
"replies": [],
},
}
self.user_activity = {
("#lunch", "alice"): 1700000000000,
}
self.msg_counter = 2
state 開頭:
channels["#lunch"] = {
"users": {"alice"},
"message_ids": [],
}
messages = {}
msg_counter = 0
send_message("#lunch", "alice", "hi"):
Step 1: _purge_inactive_users("#lunch")
L1 inactive_ttl_ms = None → 即 return(唔做嘢)
Step 2: channel 存在?✅
Step 3: alice 喺 channel?✅
Step 4: msg_id = _next_msg_id() → "msg_1"
Step 5: now = _now_ms() → 1700000000000
Step 6: messages["msg_1"] = {
"msg_id": "msg_1",
"channel_id": "#lunch",
"user_id": "alice",
"content": "hi",
"timestamp": 1700000000000,
"replies": [],
}
Step 7: channels["#lunch"]["message_ids"]
.append("msg_1")
Step 8: user_activity[("#lunch", "alice")]
= 1700000000000
return "msg_1"
get_message("msg_1") → {
"msg_id": "msg_1",
"channel_id": "#lunch",
"user_id": "alice",
"content": "hi",
"timestamp": 1700000000000,
}
# 注意:return dict 冇 "replies" field
# 因為 spec 講明 get_message 嘅 schema
# 唔包 replies。Replies 要 call get_thread 先有。
get_message("msg_999") → None
list_channels = 全部 channel 名 top_channels = 按 message 多少排 search_messages = 文字搜尋
def list_channels(self):
    # All channel ids in alphabetical order.
    result = list(self.channels.keys())
    result.sort()
    return result

def top_channels(self, n):
    # Rank channels by message count; each entry is formatted "channel(count)".
    for cid in self.channels:                 # L3: purge expired messages before counting
        self._purge_expired_messages(cid)
    counts = []                               # (channel_id, message count) pairs
    for cid in self.channels:
        count = len(self.channels[cid]["message_ids"])
        counts.append((cid, count))
    # Order: higher count first; equal counts by channel_id ascending.
    for i in range(len(counts)):
        for j in range(i + 1, len(counts)):
            swap = False
            if counts[j][1] > counts[i][1]:       # j has more messages
                swap = True
            elif counts[j][1] == counts[i][1]:
                if counts[j][0] < counts[i][0]:   # same count, j sorts first by name
                    swap = True
            if swap:
                counts[i], counts[j] = counts[j], counts[i]
    result = []
    for k in range(min(n, len(counts))):      # take the first n
        cid = counts[k][0]
        count = counts[k][1]
        result.append(cid + "(" + str(count) + ")")
    return result

def search_messages(self, channel_id, keyword):
    # Case-insensitive keyword search inside one channel.
    self._purge_expired_messages(channel_id)  # L3: drop expired messages first
    if channel_id not in self.channels:
        return []                             # unknown channel → empty list
    keyword_lower = keyword.lower()
    matched = []                              # (timestamp, msg_id) pairs
    for mid in self.channels[channel_id]["message_ids"]:
        if mid not in self.messages:          # dangling id, skip
            continue
        msg = self.messages[mid]
        if keyword_lower in msg["content"].lower():
            matched.append((msg["timestamp"], msg["msg_id"]))
    # Sort by timestamp ascending (earliest first).
    for i in range(len(matched)):
        for j in range(i + 1, len(matched)):
            if matched[j][0] < matched[i][0]:
                matched[i], matched[j] = matched[j], matched[i]
    result = []
    for ts, mid in matched:
        result.append(mid)
    return result
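Bank L2: top_spenders sorts 1 collection (accounts)
Chat L2:
list_channels → sorts the keys of channels
top_channels → reads only channels: the count is len(message_ids), the index of what lives in messages
search_messages → a query that spans 2 dicts:
first take the id list from channels[cid]["message_ids"]
then fetch the content from messages[mid]
How to read a multi-collection layout:
the channel dict usually stores only ids as references,
and the real data has to be fetched from another dict.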
def __init__(self):
self.channels = {}
self.messages = {}
self.user_activity = {}
self.msg_counter = 0
self.channels = {...}
self.messages = {...}
self.user_activity = {...}
# L2 全部係 query method,
# 唔加新 field,唔加新 dict。
state:
channels["#lunch"]["message_ids"] = ["m1","m2","m3"]
channels["#dev"]["message_ids"] = ["m4"]
channels["#cat"]["message_ids"] = ["m5","m6","m7"]
Step 1: 清過期(L1 冇用,L3 先要)
Step 2: 計 count
counts = [
("#lunch", 3),
("#dev", 1),
("#cat", 3),
]
Step 3: 排序(count desc,同分按名 asc)
("#cat", 3) ← 同 count,#cat < #lunch
("#lunch", 3)
("#dev", 1)
Step 4: 攞頭 2,format
→ ["#cat(3)", "#lunch(3)"]
state:
channels["#lunch"]["message_ids"] = ["m1","m2","m3"]
messages["m1"]["content"] = "ok lunch"
messages["m1"]["timestamp"] = 100
messages["m2"]["content"] = "12 點"
messages["m2"]["timestamp"] = 50
messages["m3"]["content"] = "OK la"
messages["m3"]["timestamp"] = 200
Step 1: keyword_lower = "ok"
Step 2: 逐條 msg:
m1: "ok lunch".lower() 有 "ok"? ✅
→ matched = [(100, "m1")]
m2: "12 點" 有 "ok"? ❌
m3: "OK la".lower() 有 "ok"? ✅
→ matched = [(100,"m1"), (200,"m3")]
Step 3: 按 timestamp asc:
(100,"m1") 喺 (200,"m3") 前 → 已經 sort 好
Step 4: 攞 msg_id → ["m1", "m3"]
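The same walkthrough can be run as a compact sketch that uses the built-in sort. The free-function signature `search_messages(message_ids, messages, keyword)` is an assumption for illustration. Python's sort is stable, so messages with equal timestamps keep their message_ids order, which may differ from the hand-written swap loop on ties.

```python
def search_messages(message_ids, messages, keyword):
    """Return matching msg_ids ordered by timestamp ascending (case-insensitive match)."""
    needle = keyword.lower()
    hits = [
        (messages[mid]["timestamp"], mid)
        for mid in message_ids
        if mid in messages and needle in messages[mid]["content"].lower()
    ]
    hits.sort(key=lambda pair: pair[0])  # earliest first; ties keep insertion order
    return [mid for _, mid in hits]


messages = {
    "m1": {"content": "ok lunch", "timestamp": 100},
    "m2": {"content": "12 點", "timestamp": 50},
    "m3": {"content": "OK la", "timestamp": 200},
}
found = search_messages(["m1", "m2", "m3"], messages, "ok")
none_found = search_messages(["m1", "m2", "m3"], messages, "zzz")
```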
TTL = 有效期 lazy = 唔自動 timer,等人查嗰陣先清 inactive = 太耐冇發言
# L3 唔加新 method,係改 __init__ 同改 send_message / get_message
# 兩個 helper(喺上面 Helpers section):
# _purge_expired_messages → message age > ttl → 刪
# _purge_inactive_users → user idle > ttl → 踢
def __init__(self, message_ttl_ms=None, inactive_ttl_ms=None):  # L3 change: two TTL parameters
    self.channels = {}                      # channel_id → {"users", "message_ids"}
    self.messages = {}                      # msg_id → message dict
    self.user_activity = {}                 # (channel_id, user_id) → last activity in ms
    self.msg_counter = 0                    # global message counter
    self.message_ttl_ms = message_ttl_ms    # L3 addition; None = messages never expire
    self.inactive_ttl_ms = inactive_ttl_ms  # L3 addition; None = users are never removed
# ─── send_message L3 改:開頭加 _purge_inactive_users ───
def send_message(self, channel_id, user_id, content): # 喺 channel 發一條 message
self._purge_inactive_users(channel_id) # ← L3 加:先踢冇活動嘅用戶
if channel_id not in self.channels: # channel_id 唔存在
return None # 呢度用 None 表示今次搵唔到結果,或者動作冇成功落地
if user_id not in self.channels[channel_id]["users"]: # 呢度係分流位;條件唔同就會走去唔同分支
return None # 啱啱被踢嘅都會 fall 入呢個 case
# ... 其餘同 L1 一樣
# ─── get_message L3 改:先 purge 過期 ───
def get_message(self, msg_id): # 攞一條 message
if msg_id not in self.messages: # msg_id 唔存在
return None # 呢度用 None 表示今次搵唔到結果,或者動作冇成功落地
msg = self.messages[msg_id] # 攞 messages 入面嘅值
self._purge_expired_messages(msg["channel_id"]) # ← L3 加
if msg_id not in self.messages: # 可能啱啱被 purge 走
return None # 呢度用 None 表示今次搵唔到結果,或者動作冇成功落地
# ... 其餘同 L1 一樣
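Bank L3: schedule_payment adds cashback handling
every operation starts by calling _process_cashbacks(now)
Chat L3: two independent TTLs, triggered from different methods:
send_message → purge inactive users (user lifecycle)
get_message / search / top → purge messages (msg lifecycle)
Both are lazy: no timer runs; the work happens at query time.
Both live in a helper that is driven by the matching self.X_ttl_ms setting.
TTL = None means the feature is disabled, and the helper returns on its first line.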
def __init__(self,
message_ttl_ms=None,
inactive_ttl_ms=None):
self.channels = {}
self.messages = {}
self.user_activity = {}
self.msg_counter = 0
self.message_ttl_ms = message_ttl_ms ← L3 新
self.inactive_ttl_ms = inactive_ttl_ms ← L3 新
# channels / messages / user_activity 結構唔變
# 加 self level:
self.message_ttl_ms = 1000 # message 1 秒過期
self.inactive_ttl_ms = 5000 # user 5 秒冇活動就踢
inactive_ttl_ms = 1000
state @ time=0:
channels["#dev"]["users"] = {"alice", "bob"}
user_activity[("#dev","alice")] = 0
user_activity[("#dev","bob")] = 0
@ time=2000,bob send_message:
Step 1: _purge_inactive_users("#dev")
alice idle = 2000-0 = 2000 > 1000 → 踢
bob idle = 2000-0 = 2000 > 1000 → 踢
結果:channels["#dev"]["users"] = set()
Step 2: check bob in users? ❌
→ return None(bob 啱啱被踢)
# 注意:spec 要求 send_message 開頭就 purge,
# 即係連自己都可能被踢。
message_ttl_ms = 1000
state @ time=0:
messages["m1"]["timestamp"] = 0
messages["m2"]["timestamp"] = 500
channels["#dev"]["message_ids"] = ["m1","m2"]
@ time=2000,call get_message("m1"):
Step 1: msg = messages["m1"]
Step 2: _purge_expired_messages("#dev")
m1 age = 2000-0 = 2000 > 1000 → 刪
m2 age = 2000-500 = 1500 > 1000 → 刪
結果:
messages = {}
channels["#dev"]["message_ids"] = []
Step 3: "m1" 仲喺 messages? ❌
→ return None
reply_to = 喺某條 message 下面回覆 thread = parent + replies merge = 兩個 channel 合一
def reply_to(self, msg_id, user_id, content): # 喺某條 message 下面回覆
if msg_id not in self.messages: # 連母訊息都冇,就等於想覆一張根本唔存在嘅單
return None # caller 收到 None,就知 reply 根本冇建立到
parent = self.messages[msg_id] # 先攞返母訊息;等陣要沿住佢知道應該掛喺邊個群
channel_id = parent["channel_id"] # reply 一定跟住母訊息個 channel;唔可以自己亂揀場地
if channel_id not in self.channels: # 母訊息仲喺度,但群組可能已經被 merge / 刪走
return None # 場都冇咗,就唔應該再接新 reply
if user_id not in self.channels[channel_id]["users"]: # 回覆者唔喺呢個群入面
return None # 唔係群組住客就冇資格喺度插嘴
# 第一步:派 reply_id,同時記低回覆時間
reply_id = self._next_msg_id() # 幫新 reply 派一張新票尾
now = self._now_ms() # 記低回覆發生喺幾時,之後 thread 排序要靠佢
# 第二步:將 reply 本體寫入 messages,再同時更新 parent 同 channel 索引
self.messages[reply_id] = { # reply 本身都係一條獨立 message,只係另外掛住母訊息
"msg_id": reply_id, # 新 reply 嘅票尾
"channel_id": channel_id, # 跟母訊息一樣,留喺同一個群
"user_id": user_id, # 邊個用戶覆嘅
"content": content, # reply 內容本體
"timestamp": now, # 回覆時間,用嚟同其他 replies 排先後
"replies": [], # 預留位俾之後再有人覆佢
}
parent["replies"].append(reply_id) # 將新 reply 掛返去母訊息下面;等於留言板加一條子回覆指標
self.channels[channel_id]["message_ids"].append(reply_id) # channel 主時間線都要見到呢條 reply,之後 scan/search 先唔會漏
self.user_activity[(channel_id, user_id)] = now # 更新講者活躍時間,證明佢啱啱仲喺場
return reply_id # 成功就交返 reply 票尾;caller 可以即刻再用佢做 lookup
def get_thread(self, msg_id): # 攞 parent + 所有 replies
if msg_id not in self.messages: # 起點都冇,就砌唔到條 thread
return None # caller 收到 None,就知母訊息不存在
parent = self.messages[msg_id] # 先攞 thread 個主幹;之後會沿住佢個 replies list 收集分支
# 第一步:先將母訊息放入 thread
thread = [] # 呢個 list 會變成成條對話串嘅快照
thread.append({ # 先擺主留言,等於先落 thread 嘅根
"msg_id": parent["msg_id"], # 母訊息票尾
"channel_id": parent["channel_id"], # 所屬群組
"user_id": parent["user_id"], # 邊個開個話題
"content": parent["content"], # 母訊息內容
"timestamp": parent["timestamp"], # 母訊息時間;之後排序要靠佢
})
# 第二步:將仍然存在嘅 replies 一條條抄入 thread
for reply_id in parent["replies"]: # 沿住母訊息條 reply 清單逐條執返出嚟
if reply_id in self.messages: # 防止遇到失效/已刪 reply 時硬撞落去
r = self.messages[reply_id] # 攞該 reply 本體,準備抄成輸出快照
thread.append({ # 將 reply 逐條加入 thread
"msg_id": r["msg_id"], # reply 自己嘅票尾
"channel_id": r["channel_id"], # 仍然屬於同一個群
"user_id": r["user_id"], # 邊個回應咗
"content": r["content"], # reply 內容
"timestamp": r["timestamp"], # reply 發生時間
})
# 第三步:最後按 timestamp 重排,等成條 thread 由最早講到最遲講
for i in range(len(thread)): # 外圈固定一個位置,準備同後面比較
for j in range(i + 1, len(thread)): # 內圈搵有冇更早講嘅 reply 應該放前面
if thread[j]["timestamp"] < thread[i]["timestamp"]: # 如果後面嗰條其實更早,就調位
thread[i], thread[j] = thread[j], thread[i] # 換位後,thread 讀落會由舊到新更自然
return thread # 交返成條對話串快照;caller 一次過就睇到母訊息連所有 replies
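def merge_channels(self, source_id, target_id): # merge the source channel into the target
if source_id not in self.channels: # no source channel, nothing to move
return False
if target_id not in self.channels: # no target channel to receive the move
return False
if source_id == target_id: # assumption (not stated in the spec): reject a self-merge; otherwise the message loop below appends to the very list it is iterating and never ends
return False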
# 第一步:先搬 users 同 user_activity
for uid in self.channels[source_id]["users"]: # 源頭群每個住客都要重新報到去 target
self.channels[target_id]["users"].add(uid) # target 係 set,所以自然會去重;唔會加出兩個同名住客
old_key = (source_id, uid) # 舊地址:住喺 source 時嗰條活動紀錄 key
new_key = (target_id, uid) # 新地址:搬到 target 後應該用呢條 key
if old_key in self.user_activity: # 只搬真係有活動紀錄嘅住客
old_ts = self.user_activity[old_key] # 先拎住佢喺 source 最後一次出現時間
if new_key in self.user_activity: # target 原本已經有同一個人嘅活動紀錄
if old_ts > self.user_activity[new_key]: # source 嗰筆如果更新鮮,就應該蓋過舊紀錄
self.user_activity[new_key] = old_ts # 保留較新活動時間;等於記住個客最近一次喺邊度講過嘢
else: # target 冇現成紀錄,就直接搬過去
self.user_activity[new_key] = old_ts # 將 source 個活動時間掛到新地址
del self.user_activity[old_key] # 舊地址交吉;全世界之後只認 target 個 key
# 第二步:再搬 messages 同 message_ids 索引
for mid in self.channels[source_id]["message_ids"]: # 源頭群公告板上每張票尾都要轉戶口
if mid in self.messages: # 防止碰到已失效 message id 時出錯
self.messages[mid]["channel_id"] = target_id # 將 message 本體戶籍改去 target,之後查返會見到佢已經搬場
self.channels[target_id]["message_ids"].append(mid) # target 時間線追加呢張舊票尾;保留原 message 本體但換咗場地
# 第三步:拆走 source channel,完成搬場
del self.channels[source_id] # 舊群正式收舖;之後所有人同 message 都只屬於 target
return True # merge 成功完成;caller 可以當 source 已經併入 target
Bank L4: merge_accounts moves balance + history
  Only 1 collection is touched (accounts)
Chat L4 merge_channels: 5 steps across 3 dicts (channels, user_activity, messages)
  1. channels[target]["users"]: add the source users
  2. user_activity: re-key (source, uid) → (target, uid), keeping the newer timestamp
  3. messages[mid]["channel_id"]: change to target
  4. channels[target]["message_ids"]: append each moved mid
  5. del channels[source]
reply_to / get_thread are specific to Chat:
  a "message references message" self-link,
  implemented as parent["replies"], a list of msg_id.
def __init__(self,
message_ttl_ms=None,
inactive_ttl_ms=None):
self.channels = {}
self.messages = {}
self.user_activity = {}
self.msg_counter = 0
self.message_ttl_ms = message_ttl_ms
self.inactive_ttl_ms = inactive_ttl_ms
self.messages = {
"msg_1": {
"msg_id": "msg_1",
"channel_id": "#lunch",
"user_id": "alice",
"content": "食咩好?",
"timestamp": 100,
"replies": ["msg_2", "msg_3"], ← L4 開始用
},
"msg_2": {
"msg_id": "msg_2",
"channel_id": "#lunch",
"user_id": "bob",
"content": "茶記",
"timestamp": 110,
"replies": [],
},
}
# Replies also live in channels[cid]["message_ids"],
# so the channel's message list includes replies.
# The only difference between a reply and a normal message:
# whether some parent's replies list contains its msg_id.
state:
channels["#lunch"]["users"] = {"alice", "bob"}
messages["msg_1"] = {
..., "content": "食咩好",
"timestamp": 100, "replies": []
}
bob: reply_to("msg_1", "bob", "茶記")
Step 1: msg_1 存在 ✅
Step 2: parent.channel = "#lunch",存在 ✅
Step 3: bob 喺 #lunch ✅
Step 4: reply_id = "msg_2"
Step 5: messages["msg_2"] = {..., timestamp=110, ...}
Step 6: parent["replies"].append("msg_2")
→ messages["msg_1"]["replies"] = ["msg_2"]
Step 7: channels["#lunch"]["message_ids"]
.append("msg_2")
→ return "msg_2"
get_thread("msg_1") → [
{"msg_id": "msg_1", ..., "timestamp": 100},
{"msg_id": "msg_2", ..., "timestamp": 110},
]
# 按 timestamp 升序
state 開頭:
channels["A"]["users"] = {"alice"}
channels["A"]["message_ids"] = ["m1"]
channels["B"]["users"] = {"bob"}
channels["B"]["message_ids"] = ["m2"]
messages["m1"]["channel_id"] = "A"
messages["m2"]["channel_id"] = "B"
user_activity[("A","alice")] = 100
user_activity[("B","bob")] = 200
merge_channels("A", "B"):
搬 users:
B["users"].add("alice") → {"bob","alice"}
user_activity[("B","alice")] = 100
del user_activity[("A","alice")]
搬 messages:
messages["m1"]["channel_id"] = "B"
B["message_ids"].append("m1") → ["m2","m1"]
del channels["A"]
# 結果:
# channels = {"B": {"users":{"alice","bob"},
# "message_ids":["m2","m1"]}}
# messages 全部 channel_id 都係 "B"
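The "re-key user_activity, keep the newer timestamp" step above can be sketched as a small standalone function. This is only an illustration: `rekey_activity` is a hypothetical helper name, not part of the reference solution, and it uses `max()` in place of the explicit if/else.

```python
def rekey_activity(user_activity, users, source_id, target_id):
    """Move (source_id, uid) entries to (target_id, uid), keeping the newer timestamp."""
    for uid in users:
        old_key = (source_id, uid)
        if old_key not in user_activity:
            continue                                  # this user has no activity record to move
        old_ts = user_activity.pop(old_key)           # vacate the old address
        new_key = (target_id, uid)
        # max() keeps whichever activity is more recent
        user_activity[new_key] = max(old_ts, user_activity.get(new_key, old_ts))
    return user_activity


activity = {("A", "alice"): 100, ("B", "bob"): 200, ("B", "alice"): 50}
rekey_activity(activity, {"alice"}, "A", "B")
```

After the call, alice's only record is under ("B", "alice") with the newer of the two timestamps, and bob's record is untouched.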
batch_send = send several messages in one call
lock per channel = each channel has its own lock
async batch_send(operations): each op is (channel_id, user_id, content); run them all concurrently with asyncio.gather.
async batch_search(operations): each op is (channel_id, keyword); run them all concurrently.
Lock per channel_id: ops on the same channel queue up, ops on different channels can run at the same time. Use defaultdict(asyncio.Lock).
async def batch_send(self, operations):  # send several messages in one call
    async def _single_send(channel_id, user_id, content):  # async helper: send one message
        async with self.locks[channel_id]:  # take this channel's lock
            return self.send_message(channel_id, user_id, content)  # pass through send_message's return value
    tasks = []  # one coroutine per op
    for channel_id, user_id, content in operations:  # unpack each op tuple
        tasks.append(_single_send(channel_id, user_id, content))  # not started yet; gather will run it
    results = await asyncio.gather(*tasks)  # run them all concurrently
    return list(results)  # plain list, same order as operations
async def batch_search(self, operations):  # search several channels in one call
    async def _single_search(channel_id, keyword):  # async helper: search one channel
        async with self.locks[channel_id]:  # searches on the same channel queue up
            return self.search_messages(channel_id, keyword)  # pass through search_messages's return value
    tasks = []  # one coroutine per op
    for channel_id, keyword in operations:  # unpack each (channel_id, keyword)
        tasks.append(_single_search(channel_id, keyword))  # not started yet; gather will run it
    results = await asyncio.gather(*tasks)  # run them all and wait for every one
    return list(results)  # plain list, same order as operations
Bank L5: account_locks = defaultdict(asyncio.Lock)
  per-account lock; a transfer touches two accounts, so it takes two locks
Chat L5: locks = defaultdict(asyncio.Lock)
  per-channel lock for batch_send / batch_search
  each op takes one lock (keyed by channel_id)
The two structures are almost identical; only the lock key differs:
  Bank → account_id as key
  Chat → channel_id as key
Why defaultdict(asyncio.Lock):
  the first access to a key creates a new Lock automatically, no setup needed in advance.
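A minimal sketch of the per-key lock idea, runnable on its own. `worker` and `trace` are hypothetical names used only for this demo; the `asyncio.sleep(0.01)` stands in for real work that yields to the event loop.

```python
import asyncio
from collections import defaultdict


async def demo():
    locks = defaultdict(asyncio.Lock)    # a Lock is created on first access to a key
    trace = []                           # records who is inside a critical section

    async def worker(channel_id, name):
        async with locks[channel_id]:    # same channel → same Lock → one at a time
            trace.append(("enter", channel_id, name))
            await asyncio.sleep(0.01)    # pretend work that yields to the event loop
            trace.append(("exit", channel_id, name))

    await asyncio.gather(
        worker("#lunch", "alice"),
        worker("#lunch", "bob"),
        worker("#dev", "carol"),
    )
    return locks, trace


locks, trace = asyncio.run(demo())
lunch = [(kind, name) for kind, cid, name in trace if cid == "#lunch"]
```

In `lunch`, bob only enters after alice has exited, while carol (a different key, so a different Lock) enters before alice is done.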
def __init__(self,
message_ttl_ms=None,
inactive_ttl_ms=None):
self.channels = {}
self.messages = {}
self.user_activity = {}
self.msg_counter = 0
self.message_ttl_ms = message_ttl_ms
self.inactive_ttl_ms = inactive_ttl_ms
self.locks = defaultdict(asyncio.Lock) ← L5 加
self.locks = defaultdict(asyncio.Lock)
# 用嗰陣:
self.locks["#lunch"] ← 第一次 access,自動 new Lock
self.locks["#dev"] ← 另一把獨立 Lock
operations = [
("#lunch", "alice", "hi"),
("#lunch", "bob", "yo"),
("#dev", "carol", "ping"),
]
Step 1: 開 3 個 coroutine:
_single_send("#lunch", "alice", "hi")
_single_send("#lunch", "bob", "yo")
_single_send("#dev", "carol", "ping")
Step 2: asyncio.gather → 同時開動
coroutine 1 攞 locks["#lunch"] ✅
呢一刻 coroutine 2 都想攞 locks["#lunch"] → 等
coroutine 3 攞 locks["#dev"] ✅
(#dev 同 #lunch 唔同 lock,可以並行)
coroutine 1 done → 還鎖
coroutine 2 攞 locks["#lunch"] ✅
Step 3: 全部 done → return [msg_id_1, msg_id_2, msg_id_3]
# 即係:
# 同 channel 嘅 op 排隊(互斥)
# 唔同 channel 嘅 op 並行
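Inside send_message:
  1. _purge_inactive_users (reads/writes users / activity)
  2. messages[msg_id] = ...
  3. channels[cid]["message_ids"].append(...)
  4. user_activity[...] = now
Two sends on the same channel at the same time:
  A reads users → alice is there
  B reads users → alice is there
  A appends msg_1 to message_ids
  B appends msg_2 to message_ids
  → list.append on its own is safe, but if these steps ever interleave
    (for example once an await is added in between), a purge can race with a send.
    While send_message stays synchronous, asyncio cannot switch tasks in the middle of it.
To be safe, and because the spec asks for it, one lock per channel.
Ops on different channels do not conflict → they do not block each other.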
sync_messages = push a channel's messages to an external system
semaphore = at most N in flight at once
fail-fast = 0 messages → return immediately
async sync_messages(channel_id, sync_func, max_concurrent=3): push the channel's messages to an external system.
Each message calls sync_func(msg_dict) (async; returns a string or raises).
asyncio.Semaphore(max_concurrent) caps how many run at once (rate limit).
Fail-fast: channel has 0 messages (after purge) → return [] immediately, never call sync_func (empty short-circuit).
Every message must be attempted (one failure does not stop the rest); return a list of {"msg_id", "status", "result"} dicts.
async def sync_messages(self, channel_id, sync_func, max_concurrent=3): # 並發 sync messages(semaphore)
self._purge_expired_messages(channel_id) # 第一步:先踢走過期訊息,免得拎住已過鐘嘅單去同步
if channel_id not in self.channels: # 連群組都唔存在,代表冇任何 message 可以派出去
return [] # 直接返空 list;caller 一眼知今次完全冇同步工作做
msg_ids = self.channels[channel_id]["message_ids"] # 拎住群組公告板上現存嘅 msg_id 清單
# 第二步:fail-fast 檢查,呢個群如果一條 message 都冇就唔好白開工
if len(msg_ids) == 0: # 群組空空如也,等於冇貨要出倉
return [] # 即刻短路;唔開 semaphore、唔砌 tasks、唔 call sync_func
# 第三步:先抄一份 message snapshot 出嚟,之後畀外部 sync_func 安心用
to_sync = [] # 之後每條元素都係一張準備出倉嘅 message 單
for mid in msg_ids: # 沿住 channel 時間線逐條 message 檢貨
if mid in self.messages: # 防止中途遇到失效 id;只同步真係仍然存在嘅 message
msg = self.messages[mid] # 攞 message 本體,準備抄成對外 payload
to_sync.append({ # 對外只交 snapshot;唔直接交內部 reference
"msg_id": msg["msg_id"], # 呢條 message 嘅票尾
"channel_id": msg["channel_id"], # 來源群組
"user_id": msg["user_id"], # 邊個講嘅
"content": msg["content"], # 真正要同步出去嘅內容
"timestamp": msg["timestamp"], # 發送時間,方便外部系統重排
})
semaphore = asyncio.Semaphore(max_concurrent) # 出貨閘口最多同時放 N 張單出去
# 第四步:定義單一 message 點樣同步;個別失敗都唔會拖冧成批
async def _sync_one(msg_dict): # 每條 message 都會經過呢個 helper
async with semaphore: # 攞到 quota 先可以 call 外部 sync_func
try: # 單條 message 自己包住 try/except;一條出事唔會影響其他貨
result = await sync_func(msg_dict) # 真正送去外部系統,同步結果由外面決定
return { # 成功就返一張報告單,講明邊條 msg 同步成功
"msg_id": msg_dict["msg_id"], # 邊張 message 單完成咗
"status": "success", # 明確話 caller 知呢張單係成功出貨
"result": result, # 外部系統返嚟嘅回執/識別碼
}
except Exception as e: # 外部系統出錯都只記錄呢一張單,不會停全批
return { # 失敗都交返報告單;caller 可以逐條睇邊張死喺邊
"msg_id": msg_dict["msg_id"], # 邊張 message 單出問題
"status": "error", # 明確標記為同步失敗
"result": str(e), # 錯誤內容包返出嚟,方便 caller 做重試或記錄
}
# 第五步:將所有 message 單一齊派出去,同步完成後整批回報
tasks = [] # 收集全部待同步 coroutine
for md in to_sync: # 每條 message snapshot 都排入出貨盤
tasks.append(_sync_one(md)) # 保留原本順序;等 gather 之後方便對返輸入
results = await asyncio.gather(*tasks) # 全批同時開工,由 semaphore 控住同時出幾多張單
return list(results) # 按原順序交返每條 message 嘅成功/失敗報告;caller 一眼見到整批同步結果
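TaskQueue L6 dispatch_external:
  Fail-fast per task: an ineligible task returns False right away (never enters the semaphore)
  An eligible task enters the semaphore + sleep + DISPATCHED
Chat L6 sync_messages:
  Fail-fast is channel-level:
    channel has no messages at all → return [] immediately
    no semaphore created, no gather started
  If there are messages, every one is attempted (a raise does not stop the others)
So the two fail-fasts sit at different layers:
  TaskQueue: per-task layer
  Chat     : whole-channel layer ("nothing to do, so do not spin up the infra")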
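The two fail-fast layers can be put side by side in a short sketch. All names here (`dispatch_per_item`, `sync_whole_batch`, `is_valid`, `fake_call`) are hypothetical and only mirror the shape of the TaskQueue and Chat methods; they are not the reference implementations.

```python
import asyncio


async def dispatch_per_item(items, is_valid, call, max_concurrent=3):
    """Per-item fail-fast: invalid items return False before touching the semaphore."""
    sem = asyncio.Semaphore(max_concurrent)

    async def one(item):
        if not is_valid(item):       # check BEFORE the semaphore: no slot, no call
            return False
        async with sem:
            await call(item)
            return True

    return list(await asyncio.gather(*(one(i) for i in items)))


async def sync_whole_batch(items, call, max_concurrent=3):
    """Batch-level fail-fast: empty input returns [] before any setup."""
    if not items:                    # nothing to do → no semaphore, no gather
        return []
    sem = asyncio.Semaphore(max_concurrent)

    async def one(item):
        async with sem:
            try:
                return {"item": item, "status": "success", "result": await call(item)}
            except Exception as e:   # one failure must not stop the rest
                return {"item": item, "status": "error", "result": str(e)}

    return list(await asyncio.gather(*(one(i) for i in items)))


calls = []


async def fake_call(item):
    calls.append(item)               # record every real attempt
    if item == "bad":
        raise ValueError("boom")
    return "OK"


per_item = asyncio.run(dispatch_per_item(["a", "", "b"], bool, fake_call))
empty = asyncio.run(sync_whole_batch([], fake_call))
mixed = asyncio.run(sync_whole_batch(["x", "bad"], fake_call))
```

The per-item version never calls `fake_call` for the invalid item; the batch version attempts every item once the batch is non-empty, and reports the failure instead of raising.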
def __init__(self,
message_ttl_ms=None,
inactive_ttl_ms=None):
self.channels = {}
self.messages = {}
self.user_activity = {}
self.msg_counter = 0
self.message_ttl_ms = message_ttl_ms
self.inactive_ttl_ms = inactive_ttl_ms
self.locks = defaultdict(asyncio.Lock)
L6 全部係 method(async sync_messages)
唔加新 dict,唔加新 field
state:
channels["#lunch"]["message_ids"] = ["m1","m2","m3","m4"]
全部都喺 self.messages
sync_messages("#lunch", my_sync, max_concurrent=2):
Step 1: _purge_expired_messages("#lunch") → 冇過期
Step 2: msg_ids = ["m1","m2","m3","m4"],len > 0 ✅
Step 3: to_sync = [4 個 message dict]
Step 4: semaphore = Semaphore(2)
Step 5: 4 個 _sync_one coroutine gather
Time 0: m1 + m2 入 sem,call sync_func
Time 0.x: m1 done → 還 → m3 入 sem
Time 0.y: m2 done → 還 → m4 入 sem
Time 0.z: m3 + m4 done
Step 6: results = [
{"msg_id":"m1","status":"success","result":"OK"},
{"msg_id":"m2","status":"error", "result":"timeout"},
{"msg_id":"m3","status":"success","result":"OK"},
{"msg_id":"m4","status":"success","result":"OK"},
]
# 注意:m2 raise 都唔停低其他
state:
channels["#empty"]["message_ids"] = []
sync_messages("#empty", my_sync) :
Step 1: _purge_expired → 冇嘢清
Step 2: msg_ids = []
Step 3: len(msg_ids) == 0 → return []
# 完全冇 call sync_func
# 冇開 semaphore
# 冇開 gather
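# This is the "immediate empty return" the spec asks for
# Why? External APIs usually have a connection cost.
# Connecting and setting up for 0 messages is pure waste.
When there are messages, every one of them awaits its sync call (the sleep):
  max_concurrent=3, 9 messages
  each call sleeps 0.01 s
  total wait ≈ ⌈9/3⌉ × 0.01 = 0.03 s (3 waves of 3)
If the fail-fast check is left out:
  with 0 messages, gather(*[]) still returns [] right away
  timing is effectively the same, but the spec wants an explicit short-circuit,
  i.e. if len(msg_ids) == 0: return []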
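The "⌈9/3⌉ waves" arithmetic can be checked with a small sketch that counts how many coroutines are inside the semaphore at once. `run_batch` and its counters are hypothetical names for this demo only; it measures peak concurrency rather than wall-clock time, since timings vary between machines.

```python
import asyncio
import math


async def run_batch(n_items, max_concurrent, delay=0.01):
    sem = asyncio.Semaphore(max_concurrent)
    in_flight = 0    # how many coroutines are inside the semaphore right now
    peak = 0         # the highest in_flight value seen

    async def one(i):
        nonlocal in_flight, peak
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(delay)     # stands in for the external call
            in_flight -= 1
            return i

    tasks = [one(i) for i in range(n_items)]
    results = await asyncio.gather(*tasks)  # with no tasks this resolves to []
    return list(results), peak


results, peak = asyncio.run(run_batch(9, 3))
waves = math.ceil(9 / 3)                    # 3 waves of 3
empty_results, empty_peak = asyncio.run(run_batch(0, 3))
```

With 9 items and a limit of 3, at most 3 are ever in flight, so the batch needs about `waves × delay` seconds. The empty batch also returns `[]` without the explicit check, which is why the short-circuit is about following the spec rather than about timing.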
A monitoring hub that collects logs from every server, like Datadog / Splunk. Each log has a timestamp, a source (which server), a level (DEBUG/INFO/WARN/ERROR) and a message.
Picture the hub receiving logs:
┌─────────────────────────────────────────────────┐
│ log_1: ts=100 source="web-1" level=INFO ... │
│ log_2: ts=101 source="db-1" level=ERROR ... │
│ log_3: ts=102 source="web-1" level=WARN ... │
│ log_4: ts=103 source="web-2" level=ERROR ... │
└─────────────────────────────────────────────────┘
Each log has:
  log_id = identifier ("log_1")
  timestamp = when it happened
  source = which server emitted it
  level = DEBUG / INFO / WARN / ERROR
  message = the log text
Questions to answer:
  1. Get all ERROR logs (sorted by timestamp)
  2. Get all logs from web-1
  3. Which source emits the most logs?
# Approach: multi-collection pattern
# one master + two indexes
self.logs # master: every log
self.by_source # index: source → [log_id]
self.by_level # index: level → [log_id]
# Why three collections?
# With only the master, every filter_by_level
# has to scan all logs, which is slow.
# With an index: by_level["ERROR"] gives the log_id list directly
# What the later levels add:
# L2 filter / top_sources
# L3 retention TTL (expired logs get purged)
# L4 count_by_level / snapshot / restore
# L5 batch_add (lock per source)
# L6 export_logs (ALL-SLEEP: every log is exported)
import time
import copy
import asyncio
from collections import defaultdict
class LogAggregator:
def __init__(self, retention_ms=None):
self.logs = {} # L1 master:log_id → log dict
self.by_source = defaultdict(list) # L1 索引:source → [log_id]
self.by_level = defaultdict(list) # L1 索引:level → [log_id]
self.log_counter = 0 # L1 寫到第幾條
self.retention_ms = retention_ms # L3 加:過期幾耐
self.snapshots = {} # L4 加
self.snap_counter = 0 # L4 加
self.locks = defaultdict(asyncio.Lock) # L5 加
self.logs = {
"log_1": {"log_id":"log_1","timestamp":100,
"source":"web-1","level":"INFO",
"message":"hello"},
"log_2": {"log_id":"log_2","timestamp":101,
"source":"db-1","level":"ERROR",
"message":"crash"},
"log_3": {"log_id":"log_3","timestamp":102,
"source":"web-1","level":"WARN",
"message":"slow"},
}
self.by_source = {
"web-1": ["log_1", "log_3"], # web-1 出兩條
"db-1": ["log_2"], # db-1 出一條
}
self.by_level = {
"INFO": ["log_1"],
"ERROR": ["log_2"],
"WARN": ["log_3"],
}
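add_log has to write to all 3 places
delete_log has to clear all 3 places
Once they drift apart, filters break (they pick up a log_id that no longer exists)
Safety net (used by the reference solution):
  when filtering, double-check that the log_id is still in self.logs
    for log_id in self.by_source[source]:
        if log_id in self.logs: # ← guard
            ...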
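One way to catch drift between the master and the two indexes while practising is a small checker. This is a sketch, not part of the reference solution: `find_index_problems` is a hypothetical helper that only assumes the three-collection layout described above.

```python
def find_index_problems(logs, by_source, by_level):
    """Return human-readable mismatches between the master dict and the two indexes."""
    problems = []
    for name, index, field in (("by_source", by_source, "source"),
                               ("by_level", by_level, "level")):
        # Every id in the index must exist in the master under the same key.
        for key, ids in index.items():
            for log_id in ids:
                if log_id not in logs:
                    problems.append(f"{name}[{key!r}] has stale id {log_id}")
                elif logs[log_id][field] != key:
                    problems.append(f"{name}[{key!r}] holds {log_id} filed under the wrong {field}")
        # Every log in the master must be listed in the index.
        for log_id, log in logs.items():
            if log_id not in index.get(log[field], []):
                problems.append(f"{log_id} missing from {name}[{log[field]!r}]")
    return problems


logs = {
    "log_1": {"source": "web-1", "level": "INFO"},
    "log_2": {"source": "db-1", "level": "ERROR"},
}
by_source = {"web-1": ["log_1"], "db-1": ["log_2"]}
by_level = {"INFO": ["log_1"], "ERROR": ["log_2"]}

clean = find_index_problems(logs, by_source, by_level)

del logs["log_2"]     # simulate a delete that forgot to clean the indexes
drifted = find_index_problems(logs, by_source, by_level)
```

`clean` is empty; `drifted` reports one stale id per index, which is exactly the situation the `if log_id in self.logs` guard papers over.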
self.logs = {}
self.by_source = defaultdict(list)
self.by_level = defaultdict(list)
self.log_counter = 0
# 後面 level 加:retention_ms (L3), snapshots (L4), locks (L5)
def _now_ms(self):  # current time in milliseconds (used from L3 on)
    return time.time() * 1000  # time.time() is in seconds, so scale to ms
def _purge_expired(self):  # added in L3: drop expired logs (lazy expiry)
    if self.retention_ms is None:  # no TTL configured → nothing to do
        return
    now = self._now_ms()  # current time
    expired_ids = []  # collect ids first; deleting from a dict while iterating it raises RuntimeError
    for log_id in self.logs:  # look at every log
        log = self.logs[log_id]
        age = now - log["timestamp"]  # age is measured from the log's own timestamp
        if age > self.retention_ms:  # older than the retention window → mark for removal
            expired_ids.append(log_id)
    for log_id in expired_ids:  # remove them one by one
        self._remove_log(log_id)  # clears the master and both indexes
def _remove_log(self, log_id):  # remove one log from all 3 collections
    if log_id not in self.logs:  # unknown log_id → nothing to do
        return
    log = self.logs[log_id]
    source = log["source"]  # which by_source bucket holds it
    level = log["level"]  # which by_level bucket holds it
    # remove this log_id from by_source
    if source in self.by_source:
        new_list = []  # rebuild the bucket without this id
        for lid in self.by_source[source]:
            if lid != log_id:  # keep every other id
                new_list.append(lid)
        self.by_source[source] = new_list
        if len(self.by_source[source]) == 0:  # bucket is empty → drop the key
            del self.by_source[source]
    # remove this log_id from by_level
    if level in self.by_level:
        new_list = []  # rebuild the bucket without this id
        for lid in self.by_level[level]:
            if lid != log_id:  # keep every other id
                new_list.append(lid)
        self.by_level[level] = new_list
        if len(self.by_level[level]) == 0:  # bucket is empty → drop the key
            del self.by_level[level]
    # finally remove it from the master
    del self.logs[log_id]
# 例:retention_ms = 1000(一秒)
# now = 5000ms
self.logs = {
"log_1": {"timestamp": 3000, ...}, # age=2000 → 過期
"log_2": {"timestamp": 4500, ...}, # age=500 → 仲喺度
"log_3": {"timestamp": 1000, ...}, # age=4000 → 過期
}
# 行完 _purge_expired:
# expired_ids = ["log_1", "log_3"]
# 然後逐個 _remove_log
# 最後 self.logs = {"log_2": {...}}
例:刪 log_1 (source="web-1", level="INFO")
Step 1: 從 by_source["web-1"] 移除 "log_1"
原本 ["log_1", "log_3"] → ["log_3"]
Step 2: 從 by_level["INFO"] 移除 "log_1"
原本 ["log_1"] → []
list 空咗 → del self.by_level["INFO"]
Step 3: del self.logs["log_1"]
3 個 collection 全部 sync
Lazy = no timer actively purging; expired logs are cleared when someone reads
Pro: no background thread, simple
Con: if nobody reads, expired logs stay in memory
Every read function (get/filter/top_sources/count) must start with
self._purge_expired()
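Lazy expiry is easier to reason about (and to test) with a controllable clock. The sketch below assumes a hypothetical `now_ms` constructor parameter that is not in the reference solution; `TinyLogStore` is a cut-down stand-in for LogAggregator with only the master dict.

```python
import time


class TinyLogStore:
    """Lazy-expiry sketch: nothing is deleted until someone reads."""

    def __init__(self, retention_ms=None, now_ms=None):
        self.logs = {}
        self.retention_ms = retention_ms
        # now_ms is an assumption for testability; default is the real clock
        self._now_ms = now_ms or (lambda: time.time() * 1000)

    def add(self, log_id, timestamp):
        self.logs[log_id] = {"log_id": log_id, "timestamp": timestamp}

    def _purge_expired(self):
        if self.retention_ms is None:      # no TTL → never expires
            return
        now = self._now_ms()
        expired = [lid for lid, log in self.logs.items()
                   if now - log["timestamp"] > self.retention_ms]
        for lid in expired:                # delete after the scan, not during it
            del self.logs[lid]

    def get(self, log_id):
        self._purge_expired()              # every read purges first
        log = self.logs.get(log_id)
        return dict(log) if log else None  # hand out a copy, not the internal dict


clock = {"now": 5000}
store = TinyLogStore(retention_ms=1000, now_ms=lambda: clock["now"])
store.add("log_1", 3000)                   # age 2000 → already expired
store.add("log_2", 4500)                   # age 500 → still alive
size_before_read = len(store.logs)         # expired log is still stored: nobody has read yet
found = store.get("log_2")
size_after_read = len(store.logs)
```

The expired entry sits in memory until the first read, which is the trade-off listed above.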
log = one record
source = which server
level = severity
counter = running count used to build ids
def add_log(self, timestamp, source, level, message): # 收一條新 log
self.log_counter = self.log_counter + 1 # counter +1
log_id = "log_" + str(self.log_counter) # 砌 log_id("log_1"、"log_2"...)
log = { # 將新值寫落呢格 state;等於而家正式更新咗紀錄
"log_id": log_id, # 呢條 log 自己嘅編號;之後 get/filter/export 都靠佢識別
"timestamp": timestamp, # 事件發生時間;TTL 清理同時間排序都靠呢格
"source": source, # 邊個系統或者 service 打出呢條 log
"level": level, # 呢條 log 屬於 INFO / WARN / ERROR 邊一級
"message": message, # 真正文字內容;即人眼最終會睇到嘅訊息
}
self.logs[log_id] = log # 加入 master
self.by_source[source].append(log_id) # 加入 by_source 索引
self.by_level[level].append(log_id) # 加入 by_level 索引
return log_id # 返 log_id
def get_log(self, log_id): # 查一條 log
self._purge_expired() # 先清過期(L3 用,L1 stub)
if log_id not in self.logs: # log_id 唔存在
return None # 唔存在或者過期咗 → None
log = self.logs[log_id] # 攞 logs 入面嘅值
return { # 回傳一份 dict 快照;caller 可以直接睇欄位內容
"log_id": log["log_id"], # 告訴 caller 呢份快照原來係邊條 log
"timestamp": log["timestamp"], # 將事件時間抄出去,方便外面再排前後
"source": log["source"], # 邊個來源打出呢條 log
"level": log["level"], # 嚴重程度一齊帶返出去
"message": log["message"], # 真正 log 文本內容
}
def delete_log(self, log_id): # 刪一條 log
if log_id not in self.logs: # log_id 唔存在
return False # 唔存在 → False
self._remove_log(log_id) # helper 同時清 3 個 collection
return True # 成功就返 True;caller 可以當今次動作真係做咗
def __init__(self):
self.logs = {}
self.by_source = defaultdict(list)
self.by_level = defaultdict(list)
self.log_counter = 0
# add_log(100, "web-1", "INFO", "hello") 之後:
self.logs = {
"log_1": {
"log_id": "log_1",
"timestamp": 100,
"source": "web-1",
"level": "INFO",
"message": "hello",
},
}
self.by_source = {"web-1": ["log_1"]}
self.by_level = {"INFO": ["log_1"]}
self.log_counter = 1
Someone says: "I want to add a log: source = web-1, level = INFO, message = hello." (Here log_1 with the same source and level already exists.)
1. counter +1 → 2
2. log_id = "log_2"
3. build the dict with all 5 fields
4. write to the master: self.logs["log_2"] = log
5. by_source["web-1"].append("log_2") → ["log_1", "log_2"]
6. by_level["INFO"].append("log_2") → ["log_1", "log_2"]
7. return "log_2"
get_log("log_1") → {"log_id":"log_1","timestamp":100,
"source":"web-1","level":"INFO",
"message":"hello"}
get_log("log_999") → None
# Note: return a new dict copy, do not return self.logs[log_id] directly,
# otherwise a caller that edits the dict corrupts internal state
delete_log("log_1")
# _remove_log("log_1") 一次過清 3 個地方:
# by_source["web-1"]: ["log_1"] → [] → del by_source["web-1"]
# by_level["INFO"]: ["log_1"] → [] → del by_level["INFO"]
# del self.logs["log_1"]
# return True
delete_log("log_999") → False
filter_by_level = filter by severity level
filter_by_source = filter by server
top_sources = which servers emit the most logs
def filter_by_level(self, level):  # all logs at one level, ascending by timestamp
    self._purge_expired()  # drop expired logs first
    if level not in self.by_level:  # use "in" so the defaultdict does not create an empty bucket
        return []  # this level was never seen → empty list
    result = []
    for log_id in self.by_level[level]:  # walk the index, not the whole master
        if log_id in self.logs:  # guard: the index may hold a stale id
            log = self.logs[log_id]
            result.append({  # copy the fields so callers cannot mutate internal state
                "log_id": log["log_id"],
                "timestamp": log["timestamp"],
                "source": log["source"],
                "level": log["level"],
                "message": log["message"],
            })
    # sort ascending by timestamp (hand-written O(n²) exchange sort, not a true bubble sort;
    # logs with equal timestamps may not keep their insertion order)
    for i in range(len(result)):  # slot i receives the earliest remaining log
        for j in range(i + 1, len(result)):  # scan the rest for something earlier
            if result[j]["timestamp"] < result[i]["timestamp"]:  # a later slot holds an earlier log
                result[i], result[j] = result[j], result[i]  # swap it forward
    return result
def filter_by_source(self, source):  # all logs from one source, ascending by timestamp
    self._purge_expired()  # drop expired logs first
    if source not in self.by_source:  # this source was never seen
        return []
    result = []
    for log_id in self.by_source[source]:  # walk the index, not the whole master
        if log_id in self.logs:  # guard: the index may hold a stale id
            log = self.logs[log_id]
            result.append({  # copy the fields so callers cannot mutate internal state
                "log_id": log["log_id"],
                "timestamp": log["timestamp"],
                "source": log["source"],
                "level": log["level"],
                "message": log["message"],
            })
    for i in range(len(result)):  # same exchange sort as filter_by_level
        for j in range(i + 1, len(result)):
            if result[j]["timestamp"] < result[i]["timestamp"]:  # a later slot holds an earlier log
                result[i], result[j] = result[j], result[i]  # swap it forward
    return result
def top_sources(self, n):  # the N sources with the most logs
    self._purge_expired()  # drop expired logs first
    counts = []  # list of (source, count)
    for source in self.by_source:  # count live logs per source
        count = 0
        for log_id in self.by_source[source]:
            if log_id in self.logs:  # guard: only count ids still in the master
                count = count + 1
        if count > 0:  # skip sources with no live logs
            counts.append((source, count))
    # order: count descending, ties broken by source name ascending
    for i in range(len(counts)):  # slot i receives the best remaining source
        for j in range(i + 1, len(counts)):
            swap = False
            if counts[j][1] > counts[i][1]:  # larger count goes first
                swap = True
            elif counts[j][1] == counts[i][1]:  # tie on count
                if counts[j][0] < counts[i][0]:  # alphabetically smaller name goes first
                    swap = True
            if swap:
                counts[i], counts[j] = counts[j], counts[i]
    result = []
    for k in range(min(n, len(counts))):  # take the first N
        source = counts[k][0]
        count = counts[k][1]
        result.append(source + "(" + str(count) + ")")  # "source(count)" format
    return result
def __init__(self):
self.logs = {}
self.by_source = defaultdict(list)
self.by_level = defaultdict(list)
self.log_counter = 0
No new fields needed: timestamp / source / level are already there
Ordering by timestamp / count is computed on the fly at read time
# 假設:
self.logs = {
"log_1": {"timestamp":100,"level":"ERROR",...},
"log_2": {"timestamp":50, "level":"ERROR",...},
"log_3": {"timestamp":75, "level":"INFO", ...},
}
self.by_level = {
"ERROR": ["log_1", "log_2"],
"INFO": ["log_3"],
}
filter_by_level("ERROR"):
# 1. _purge_expired()
# 2. by_level["ERROR"] = ["log_1", "log_2"]
# 3. result = [log_1_dict, log_2_dict] (timestamps 100, 50)
# 4. hand-written sort by timestamp, ascending
# → [log_2_dict (ts=50), log_1_dict (ts=100)]
# 5. return result
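If the built-in is allowed (an assumption; the notes write the loop by hand), `sorted` with a key does the same job and is guaranteed stable, so logs with equal timestamps keep their insertion order. The sketch below compares it with the hand-written nested loop on a tie; `exchange_sort` is a hypothetical name wrapping that loop.

```python
def exchange_sort(rows):
    """The hand-written nested-loop sort, copied onto a fresh list."""
    rows = list(rows)
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            if rows[j]["timestamp"] < rows[i]["timestamp"]:
                rows[i], rows[j] = rows[j], rows[i]
    return rows


rows = [
    {"log_id": "log_1", "timestamp": 100},
    {"log_id": "log_3", "timestamp": 100},   # same timestamp as log_1, inserted later
    {"log_id": "log_2", "timestamp": 50},
]

by_hand = [r["log_id"] for r in exchange_sort(rows)]
builtin = [r["log_id"] for r in sorted(rows, key=lambda r: r["timestamp"])]
```

Both put `log_2` first, but only `sorted` keeps `log_1` ahead of `log_3`. If a spec says ties keep insertion order, that difference matters.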
# 假設:
self.by_source = {
"web-1": ["log_1", "log_2", "log_3"], # 3
"db-1": ["log_4"], # 1
"api-1": ["log_5", "log_6"], # 2
"auth": ["log_7", "log_8"], # 2
}
top_sources(3):
# 1. counts = [("web-1",3),("db-1",1),("api-1",2),("auth",2)]
# 2. sort by (-count, source):
# ("web-1", 3) ← count 最大
# ("api-1", 2) ← 同 count,字母細
# ("auth", 2)
# ("db-1", 1)
# 3. result = ["web-1(3)", "api-1(2)", "auth(2)"]
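The "(-count, source)" ordering in the walkthrough maps directly onto a sort key. A minimal sketch, assuming the counts have already been collected into a plain dict (the variable names are illustrative only):

```python
counts = {"web-1": 3, "db-1": 1, "api-1": 2, "auth": 2}

# Negating the count sorts it descending; the name breaks ties ascending.
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
top3 = [f"{source}({count})" for source, count in ranked[:3]]
```

`top3` matches the walkthrough result: the highest count first, then the two tied sources in alphabetical order.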
retention = how long a log is kept
TTL = time to live
lazy expiry = purge only when someone reads
age is measured from the log's own timestamp against now
def __init__(self, retention_ms=None): # L3 改:constructor 收 retention_ms
self.logs = {} # 重設 self.logs
self.by_source = defaultdict(list) # 將新值寫落呢格 state;等於而家正式更新咗紀錄
self.by_level = defaultdict(list) # 將新值寫落呢格 state;等於而家正式更新咗紀錄
self.log_counter = 0 # 更新 self.log_counter
self.retention_ms = retention_ms # L3 加:過期幾耐(毫秒)
# _purge_expired is already written in Helpers; from L3 on, every read function calls it first
# get_log / filter_by_level / filter_by_source / top_sources all start with
# self._purge_expired()
# How _purge_expired works (shown again):
def _purge_expired(self): # 清走過期 log(lazy);唔係背景刪,係有人讀 aggregation 前先順手清舊紀錄
if self.retention_ms is None: # 冇設 → 唔做嘢
return # 返(冇 return 值)
now = self._now_ms() # time.time() * 1000
expired_ids = [] # 先開個空 list,等陣逐項放結果或工作入去
for log_id in self.logs: # 逐條 log 睇 age
log = self.logs[log_id] # 攞 logs 入面嘅值
age = now - log["timestamp"] # age 用 log 自己 timestamp 比較
if age > self.retention_ms: # 呢度係分流位;條件唔同就會走去唔同分支
expired_ids.append(log_id) # 將呢項塞入 list,留待之後一齊處理或回傳
for log_id in expired_ids: # 逐項巡過去;每次處理一個元素
self._remove_log(log_id) # 三個 collection 一齊清
def __init__(self, retention_ms=None):
self.logs = {}
self.by_source = defaultdict(list)
self.by_level = defaultdict(list)
self.log_counter = 0
self.retention_ms = retention_ms ← L3 加
# The log dict is unchanged: it already has a timestamp field
# The only addition is retention_ms on self
self.retention_ms = 1000 ← logs expire once they are more than 1 second old
# Important: age uses the log's own timestamp, not the time at which add_log ran
# log["timestamp"] is passed in by the caller and can differ from now
now = time.time() * 1000 # e.g. 5000
log = {"timestamp": 3000, ...} # two seconds ago
age = now - log["timestamp"] # 2000
if age > self.retention_ms: # e.g. retention=1000
    # 2000 > 1000 → expired
# retention_ms = 1000,now = 5000
self.logs = {
"log_1": {"timestamp": 4500, ...}, # age=500 仲喺度
"log_2": {"timestamp": 3000, ...}, # age=2000 過期
"log_3": {"timestamp": 4900, ...}, # age=100 仲喺度
}
filter_by_level("INFO"):
# Step 1: _purge_expired()
# 逐條 log 睇 age → log_2 過期
# expired_ids = ["log_2"]
# _remove_log("log_2") → 三個 collection 一齊清
# Step 2: 從 by_level["INFO"] 攞 log_id
# Step 3: 砌 result list
constructor default: retention_ms=None
  i.e. no TTL configured → logs never expire
  this case skips the for loop entirely:
  _purge_expired returns right away without scanning self.logs
count_by_level = count how many logs each level has
snapshot = take a picture of the current state
restore = roll back to a snapshot
deepcopy = copy every nested level
def count_by_level(self): # 數每個 level 有幾多條未過期 log
self._purge_expired() # 第一步:先掃走過期 log,等最後張統計表只計仍然生效嗰批
result = {} # 呢個 dict 會係最後交畀 caller 嘅 level 點名簿
for level in self.by_level: # 逐個嚴重程度數人頭,好似逐個桶點算仲剩幾多張單
count = 0 # 每個 level 由零開始重數
for log_id in self.by_level[level]: # 索引入面可能仲留住舊 id,所以逐個再核對一次 master logs
if log_id in self.logs: # 真係仲喺主帳本先算數;避免將已刪/已過期 log 再計多次
count = count + 1 # 呢個 level 又多一條有效 log
if count > 0: # 零條嗰個 level 唔放入結果,等輸出簡潔啲
result[level] = count # 將點完數嘅結果記入報表,例如 "ERROR": 3
return result # 交返一張 level→count 報表;caller 一眼睇到邊種嚴重程度最多
def snapshot(self): # 影一張相,俾你之後 restore
self._purge_expired() # 第一步:先清垃圾,再影相;唔想將過期 log 影入備份
self.snap_counter = self.snap_counter + 1 # 每影一次相就派一個新 snapshot 編號
snap_id = "snap_" + str(self.snap_counter) # 組出今次快照張相嘅名字,例如 snap_3
# 第二步:將當前 logs 深拷貝一份,好似影低當刻全景相
logs_copy = copy.deepcopy(self.logs) # deepcopy 後就算之後改現場,張相入面內容都唔會一齊變
# 第三步:用 copy 出嚟嗰批 log 重砌索引,確保張相入面三個 collection 互相對得返
by_source_copy = defaultdict(list) # snapshot 專屬 source 索引;等於另外抄一本按來源分類嘅簿
by_level_copy = defaultdict(list) # snapshot 專屬 level 索引;避免 restore 後要即場再重算
for log_id in logs_copy: # 張相入面每條 log 都重新掛返去兩本索引簿
log = logs_copy[log_id] # 拎住呢條 log 嘅 copy,睇佢屬於邊個 source / level
by_source_copy[log["source"]].append(log_id) # 將 log_id 掛去對應 source 桶,方便之後 restore 完即刻可查
by_level_copy[log["level"]].append(log_id) # 同一時間按 level 再掛一次;張相應該係完整可用嘅狀態
# 第四步:將呢三份 copy 收入 snapshots 櫃桶
self.snapshots[snap_id] = { # 存低一個完整凍結狀態,之後 restore 可以原封不動回帶
"logs": logs_copy, # 當刻主帳本快照
"by_source": by_source_copy, # 當刻 source 索引快照
"by_level": by_level_copy, # 當刻 level 索引快照
}
return snap_id # 交返今次快照編號;caller 之後拎住佢先知道要回帶去邊一張相
def restore(self, snapshot_id): # 用 snapshot 覆蓋當前 state
if snapshot_id not in self.snapshots: # 想回帶去嘅相根本唔存在,就冇得還原現場
return False # 直接話 restore 失敗;caller 應該知道呢個 snapshot_id 無效
snap = self.snapshots[snapshot_id] # 先拎出嗰張歷史相;入面包含當時完整三本簿
# 第一步:將當前現場整個蓋過,回到 snapshot 嗰刻
self.logs = copy.deepcopy(snap["logs"]) # 主帳本回帶;之後新加過嘅 log 會一齊消失
self.by_source = copy.deepcopy(snap["by_source"]) # source 索引都同步回帶,避免帳本同索引唔一致
self.by_level = copy.deepcopy(snap["by_level"]) # level 索引一樣照相還原;之後 count/filter 先會對數
return True # restore 完成;caller 可以當成個 log 系統已經回到舊狀態
def __init__(self, retention_ms=None):
self.logs = {}
self.by_source = defaultdict(list)
self.by_level = defaultdict(list)
self.log_counter = 0
self.retention_ms = retention_ms
self.snapshots = {} ← L4 加
self.snap_counter = 0 ← L4 加
self.snapshots = {
"snap_1": {
"logs": {"log_1": {...}, "log_2": {...}}, # deep copy
"by_source": {"web-1": ["log_1"], ...},
"by_level": {"INFO": ["log_1"], ...},
},
"snap_2": { ... },
}
self.snap_counter = 2
# self.by_level = {
# "ERROR": ["log_1", "log_2"],
# "INFO": ["log_3"],
# "WARN": [], ← 全部已過期 / 刪除
# }
count_by_level():
# ERROR: count=2 → result["ERROR"]=2
# INFO: count=1 → result["INFO"]=1
# WARN: count=0 → 唔加入 result
# return {"ERROR": 2, "INFO": 1}
If you write self.snapshots[id] = self.logs directly,
  a later self.logs["log_5"] = {...} leaks into the snapshot,
  because both names point at the same dict
deepcopy = the inner dicts are copied too
  even after self.logs["log_1"]["message"] = "changed",
  the message of "log_1" inside the snapshot stays the same
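The difference between an alias, a shallow copy and a deep copy is quick to see on a two-level dict. This is a standalone sketch with illustrative variable names, not the snapshot method itself.

```python
import copy

logs = {"log_1": {"message": "hello"}}

alias = logs                    # same dict object, nothing copied
shallow = dict(logs)            # new outer dict, inner dicts still shared
deep = copy.deepcopy(logs)      # outer and inner dicts are all new

logs["log_2"] = {"message": "new"}       # add a key to the live state
logs["log_1"]["message"] = "changed"     # mutate an inner dict in place
```

Only `deep` still shows the state as it was when the copy was taken: `alias` sees both changes, and `shallow` misses the new key but still sees the edited message.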
# 影相之前:
self.logs = {"log_1": {...}, "log_2": {...}}
self.by_source = {"web-1": ["log_1", "log_2"]}
self.by_level = {"INFO": ["log_1", "log_2"]}
snapshot():
# 1. _purge_expired() 先(snapshot 唔影過期嘢)
# 2. snap_counter += 1 → snap_id = "snap_1"
# 3. logs_copy = deepcopy(self.logs)
# 4. 重建索引(從 logs_copy 行 for loop)
# 5. 存入 self.snapshots["snap_1"]
# return "snap_1"
# 之後加咗條新 log:
self.logs["log_3"] = {...}
restore("snap_1"):
# 1. snap = self.snapshots["snap_1"]
# 2. self.logs = deepcopy(snap["logs"])
# → 而家 self.logs 得返 log_1 同 log_2,log_3 冇咗
# 3. 同樣 deepcopy by_source / by_level
# return True
restore("snap_999") → False
async = asynchronous
batch = several operations in one call
lock per source = one lock for each source
gather = run them all concurrently
async def batch_add(self, operations):  # add several logs concurrently, with a lock per source
    async def _single_add(timestamp, source, level, message):  # async helper: add one log entry
        async with self.locks[source]:  # ops on the same source queue up (keeps log_id order)
            return self.add_log(timestamp, source, level, message)  # returns the new log_id
    tasks = []  # one coroutine per op
    for timestamp, source, level, message in operations:  # unpack each tuple into a coroutine
        tasks.append(_single_add(timestamp, source, level, message))
    results = await asyncio.gather(*tasks)  # run them all concurrently
    return list(results)  # same order as operations
def __init__(self, retention_ms=None):
self.logs = {}
self.by_source = defaultdict(list)
self.by_level = defaultdict(list)
self.log_counter = 0
self.retention_ms = retention_ms
self.snapshots = {}
self.snap_counter = 0
self.locks = defaultdict(asyncio.Lock) ← L5 加
self.locks = {
"web-1": <asyncio.Lock>, # web-1 嘅鎖
"db-1": <asyncio.Lock>, # db-1 嘅鎖
}
# defaultdict:第一次 self.locks["new-source"] 自動造一把 Lock
operations = [
(100, "web-1", "INFO", "a"), # op 0
(101, "db-1", "ERROR", "b"), # op 1
(102, "web-1", "WARN", "c"), # op 2
(103, "db-1", "INFO", "d"), # op 3
]
batch_add(operations):
# 砌 4 個 _single_add coroutine
# asyncio.gather 同時做晒
# op 0 攞 web-1 鎖 → add_log → log_1
# op 1 攞 db-1 鎖 → add_log → log_2 (同時)
# op 2 等 web-1 鎖(op 0 攞住)→ 釋放後 add_log → log_3
# op 3 等 db-1 鎖(op 1 攞住)→ 釋放後 add_log → log_4
# return ["log_1", "log_2", "log_3", "log_4"]
# 順序同 operations 一樣
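Inside add_log:
  self.log_counter += 1
  log_id = "log_" + str(self.log_counter)
  self.logs[log_id] = log
If two coroutines touched self.log_counter at the same moment, ids could collide.
Is lock per source enough, given that the counter is shared across sources?
  asyncio runs on a single thread, and there is no await between
  self.log_counter += 1 and the writes after it, so add_log never yields midway.
  In practice there is no race here.
The spec asks for lock per source, so just follow it.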
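The "no await, no race" point can be demonstrated with two id generators. `safe_next_id` mirrors the shape of add_log's counter update; `racy_next_id` is a hypothetical variant with an await between the read and the write, included only to show what would go wrong.

```python
import asyncio


async def demo():
    state = {"counter": 0}

    async def safe_next_id():
        state["counter"] += 1                  # no await between read and write
        return "log_" + str(state["counter"])

    async def racy_next_id():
        current = state["counter"]             # read
        await asyncio.sleep(0)                 # yield: the other tasks run here
        state["counter"] = current + 1         # write based on a stale read
        return "log_" + str(state["counter"])

    safe = await asyncio.gather(*(safe_next_id() for _ in range(3)))
    state["counter"] = 0                       # reset before the second run
    racy = await asyncio.gather(*(racy_next_id() for _ in range(3)))
    return list(safe), list(racy)


safe_ids, racy_ids = asyncio.run(demo())
```

The safe version hands out three distinct ids; the racy one hands out the same id three times, because every task read the counter before any of them wrote it back. A lock around the read-await-write section would be what fixes the second case.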
export = push to an external system
semaphore = cap on how many run at once
ALL-SLEEP = every log is attempted, no filtering by level
async def export_logs(self, export_func, max_concurrent=3): # 推所有 log 去外部,限制 concurrent
self._purge_expired() # 第一步:先踢走過期 log,只將仍然有效嗰批送出去
# 第二步:將現場所有有效 log 抄成 export payload;ALL-SLEEP 代表唔會先篩走任何 level
to_export = [] # 之後每條元素都係一張準備出貨嘅 log 單
for log_id in self.logs: # 主帳本有幾條,就逐條都入出貨盤
log = self.logs[log_id] # 攞出呢條 log 本體,準備抄成對外格式
to_export.append({ # 交畀外部前先複製一份快照;外部改佢都唔會污染內部 state
"log_id": log["log_id"], # 呢條 log 嘅編號
"timestamp": log["timestamp"], # 事件發生時間
"source": log["source"], # 邊個系統打出嚟
"level": log["level"], # INFO / WARN / ERROR 等級
"message": log["message"], # 真正要輸出嘅文字內容
})
semaphore = asyncio.Semaphore(max_concurrent) # 同一時間最多只准 N 條 log 走出去外部系統
# 第三步:定義單一 log 點樣 export;一條出事唔會拖冧成批
async def _export_one(log_dict): # 每條 log 都會經過呢個 helper
async with semaphore: # 攞到出口配額先真正 call export_func
try: # 個別 exception 就地處理;唔會因一條壞單停晒全隊
result = await export_func(log_dict) # 真正送去外部;對方回執係成功結果
return { # 成功就交返一張匯報單,講明邊條 log 出貨成功
"log_id": log_dict["log_id"], # 邊條 log 完成咗 export
"status": "success", # caller 可以一眼分辨呢條係成功
"result": result, # 外部系統返嚟嘅回執/識別碼
}
except Exception as e: # 外部出口卡住都只記錄呢一條;其他照樣繼續出貨
return { # 失敗都交返報告,等 caller 之後可以重試或追查
"log_id": log_dict["log_id"], # 邊條 log 出問題
"status": "error", # 明確標記為 export 失敗
"result": str(e), # 錯誤內容一併帶返,方便 caller 做追蹤
}
# 第四步:將所有 log 單一齊派出去,完成後整批回報
tasks = [] # 收集全部待 export coroutine
for ld in to_export: # 每條 log snapshot 都排入出貨盤
tasks.append(_export_one(ld)) # 保留原本順序;等 gather 完方便對返輸入
results = await asyncio.gather(*tasks) # 全批同時開工,由 semaphore 控住同時出幾多條 log
return list(results) # 按原順序交返每條 log 嘅成功/失敗報告;caller 一眼見到整批 export 結果
def __init__(self, retention_ms=None):
self.logs = {}
self.by_source = defaultdict(list)
self.by_level = defaultdict(list)
self.log_counter = 0
self.retention_ms = retention_ms
self.snapshots = {}
self.snap_counter = 0
self.locks = defaultdict(asyncio.Lock)
ALL-SLEEP 意思:每條 log 都會「瞓一覺」(await export_func)
唔似 Chat L6 fail-fast(0 messages 即刻 return)
唔似 Bank L6 conditional(睇條件先 schedule)
LogAgg:100 條 log → 100 個 export 嘗試
即使 export_func raise exception,個 entry 都會出現喺 result
唔會 skip 任何一條
Semaphore(3) = 同一時間最多 3 個喺度做嘢
第 4 個 coroutine 要等其中一個釋放
想像 5 條 log,max_concurrent=3:
t=0: log_1, log_2, log_3 攞到 semaphore,開始 await
t=1: log_4, log_5 等緊
t=2: log_2 完成,釋放 → log_4 攞到,開始
t=3: log_1 完成,釋放 → log_5 攞到,開始
最終 5 個 entry 全部喺 result 度
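The timeline above can be reproduced with a short, self-contained sketch. `run_with_limit` and the `running` / `peak` counters are hypothetical names added only to observe how many coroutines hold the semaphore at once.

```python
import asyncio


async def run_with_limit(n_items=5, max_concurrent=3):
    semaphore = asyncio.Semaphore(max_concurrent)
    running = 0   # coroutines currently inside the semaphore
    peak = 0      # highest value of `running` seen so far

    async def one(i):
        nonlocal running, peak
        async with semaphore:          # the 4th and 5th callers wait here
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for await export_func(...)
            running -= 1
            return "log_" + str(i + 1)

    results = await asyncio.gather(*(one(i) for i in range(n_items)))
    return list(results), peak
```

Every item still produces an entry in the result; the semaphore only delays when each one starts.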
async def fake_export(log_dict):
if log_dict["level"] == "ERROR":
raise Exception("export failed")
return f"sent {log_dict['log_id']}"
# 假設:
self.logs = {
"log_1": {"level": "INFO", ...},
"log_2": {"level": "ERROR", ...},
"log_3": {"level": "WARN", ...},
}
await export_logs(fake_export, max_concurrent=2):
# 1. _purge_expired()
# 2. to_export = [log_1_dict, log_2_dict, log_3_dict]
# (全部都收,唔篩 level)
# 3. 3 個 _export_one coroutine(受 Semaphore(2) 限制)
# 4. log_1: success, result="sent log_1"
# log_2: error, result="export failed"
# log_3: success, result="sent log_3"
# 5. return [
# {"log_id":"log_1","status":"success","result":"sent log_1"},
# {"log_id":"log_2","status":"error", "result":"export failed"},
# {"log_id":"log_3","status":"success","result":"sent log_3"},
# ]
export_func 係 user code,可能改個 dict
如果直接傳 self.logs[log_id],會污染內部 state
所以 copy 一份新 dict 畀 user
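A minimal sketch of the copy-before-handing-out idea, assuming each stored log is a flat dict (with nested values a `copy.deepcopy` would be needed instead). `snapshot_for_export` is a hypothetical helper name.

```python
def snapshot_for_export(logs):
    """Return shallow copies so callers cannot mutate the stored dicts."""
    return [dict(log) for log in logs.values()]


internal = {"log_1": {"log_id": "log_1", "level": "INFO", "message": "ok"}}
payload = snapshot_for_export(internal)
payload[0]["level"] = "TAMPERED"  # what a careless export_func might do
```

After the mutation, `internal["log_1"]["level"]` is still `"INFO"` because the payload holds a separate dict.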
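Unfamiliar domain in the exam? Find the closest one below and look at what its L1-L4 do. The pattern is always the same.
def __init__(self):
    self.main_store = {}                      # L1
    self.counter = 0                          # L1 (only if IDs must be generated)
    self.pending_items = []                   # L3 (lazy processing; may be a dict instead)
    self.locks = defaultdict(asyncio.Lock)    # L5
What lives at the top level and what lives inside the entity:
Counter → always top level (global sequential ID)
Locks → always top level (per entity, used in L5-L6)
Pending list → top level or inside the entity, either works
Other data → inside the entity dict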
L1: add_node / remove_node / assign_key(key)→搵最近 node
self.ring = {} # position → node_id
self.sorted_pos = [] # bisect 搵最近
L2: get_keys_for_node / get_node_load sorted
L3: virtual nodes — 每個 node 有 N 個 position
hash(node_id + "_" + str(i))
L4: snapshot + restore(deepcopy)
L5: async batch_assign(keys) → gather + lock
L6: async migrate(keys, max) → semaphore
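A rough sketch of the L1 and L3 lines of this skeleton: positions kept in a sorted list, lookup with `bisect`, and N virtual positions per node. `TinyRing` and `ring_position` are hypothetical names. The notes only say `hash(...)`; md5 is swapped in here as an assumption, because Python's built-in `hash()` on strings changes between runs.

```python
import bisect
import hashlib


def ring_position(label, ring_size=2**32):
    # Stable hash so the same label always lands on the same position.
    digest = hashlib.md5(label.encode("utf-8")).hexdigest()
    return int(digest, 16) % ring_size


class TinyRing:
    def __init__(self, replicas=3):
        self.replicas = replicas
        self.ring = {}        # position -> node_id
        self.sorted_pos = []  # sorted positions, searched with bisect

    def add_node(self, node_id):
        for i in range(self.replicas):  # one position per virtual node
            pos = ring_position(node_id + "_" + str(i))
            if pos in self.ring:
                continue  # this sketch simply skips rare collisions
            self.ring[pos] = node_id
            bisect.insort(self.sorted_pos, pos)

    def assign_key(self, key):
        if not self.sorted_pos:
            return None
        idx = bisect.bisect_right(self.sorted_pos, ring_position(key))
        if idx == len(self.sorted_pos):
            idx = 0  # walked past the last position: wrap around clockwise
        return self.ring[self.sorted_pos[idx]]
```

Whether a key sitting exactly on a node position belongs to that node (`bisect_left`) or the next one (`bisect_right`) depends on the spec wording, so check it before copying.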
L1: set(key, value) / get(key) / delete(key)
self.cache = {} # key → value
self.capacity = N
L2: get_all_keys sorted / search_by_prefix
if key.startswith(prefix)
L3: TTL — set_with_ttl(key, value, ttl, timestamp)
alive when ts < expiry
L4: LRU eviction — 滿咗踢最舊嘅
用 OrderedDict 或 list 記 usage order
L5: async batch_get(keys) → gather + lock
L6: async batch_evict(keys, max) → semaphore
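The L4 line (LRU eviction with an `OrderedDict`) can be sketched as below. `TinyLRU` is a hypothetical name, and this sketch assumes that both `get` and `set` count as "usage"; a real spec may define usage differently.

```python
from collections import OrderedDict


class TinyLRU:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # least recently used entry comes first

    def get(self, key):
        if key not in self.cache:
            return None
        self.cache.move_to_end(key)  # mark as most recently used
        return self.cache[key]

    def set(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used
```

With capacity 2, setting `a`, `b`, reading `a`, then setting `c` evicts `b`, because `a` was touched more recently.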
L1: set(key, field, value) / get / delete
self.data = {} # key → {field: value}
L2: scan(key) sorted / scan_by_prefix
field.startswith(prefix)
L3: set_at_with_ttl — ts + ttl = expiry
get_at — check ts < expiry
L4: backup(ts) → deepcopy + 記 remaining TTL
restore(ts) → 還原 + 重算 expiry
L5: async batch_operations → gather + lock per key
L6: async batch_scan(keys, max) → semaphore
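The L4 line is the part that is easiest to get wrong: a backup stores the remaining TTL, and restore recomputes expiry from the restore time. A minimal sketch under assumptions: `TinyTTLStore` is a hypothetical name, it keeps only the most recent backup, and `restore(ts)` always restores that one.

```python
import copy


class TinyTTLStore:
    """Hypothetical store: key -> {field: (value, expiry_or_None)}."""

    def __init__(self):
        self.data = {}
        self.last_backup = None  # {key: {field: (value, remaining_or_None)}}

    def set_at(self, key, field, value, ts, ttl=None):
        expiry = None if ttl is None else ts + ttl
        self.data.setdefault(key, {})[field] = (value, expiry)

    def get_at(self, key, field, ts):
        value, expiry = self.data.get(key, {}).get(field, (None, None))
        if expiry is not None and ts >= expiry:
            return None  # alive only while ts is strictly less than expiry
        return value

    def backup(self, ts):
        snapshot = {}
        for key, fields in self.data.items():
            for field, (value, expiry) in fields.items():
                if expiry is not None and ts >= expiry:
                    continue  # already expired: not part of the backup
                remaining = None if expiry is None else expiry - ts
                snapshot.setdefault(key, {})[field] = (copy.deepcopy(value), remaining)
        self.last_backup = snapshot

    def restore(self, ts):
        if self.last_backup is None:
            return False
        self.data = {}
        for key, fields in self.last_backup.items():
            for field, (value, remaining) in fields.items():
                expiry = None if remaining is None else ts + remaining
                self.data.setdefault(key, {})[field] = (copy.deepcopy(value), expiry)
        return True
```

A field set at ts=100 with ttl=50 and backed up at ts=120 has 30 ms left; restoring at ts=1000 makes it expire at 1030, not at the original 150.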
L1: create_account / deposit / transfer
self.accounts = {} # id → {"balance":0}
L2: top_spenders(n) sorted desc + format
L3: pay → cashback after 86400000ms
_process_cashbacks every method
L4: merge_accounts / get_balance(time_at)
history = [(ts, balance)]
L5: async batch → gather + lock per account
L6: async external_transfer → semaphore
L1: add_room / book_room→return bid / checkout→return name
self.rooms = {} # id → {"guest":"","total_revenue":0}
L2: top_rooms / find_available(min,max) + check occupied
L3: late_checkout→fee after 3600000ms
_process_late_fee check fee>0 唔係 check guest
L4: get_booking_history→return tuple唔係string
upgrade_room→搬guest唔係merge
L5: async batch → gather + lock per room
L6: async send_notifications → semaphore
L1: upload(name, size) / get(name) / copy(src, dest)
self.files = {} # name → {"size":0}
self.capacity = N
L2: search(prefix) sorted / top_n_largest
L3: upload_at_with_ttl / get_at
TTL: alive when ts < expiry
L4: rollback(ts) → deepcopy restore
OR compress/decompress
L5: async batch_upload → gather + lock
L6: async batch_download → semaphore
L1: add_task(id, priority) / get_next_task(FIFO within priority)
self.tasks = {} # id → {"priority":0,"status":"QUEUED"}
L2: start/complete/fail_task + get_tasks_by_status
L3: retry with backoff — fail→RETRY_SCHEDULED
retry_time = ts + base * 2^attempt
process_retries(ts)
L4: dependencies — 所有 deps COMPLETED 先可以 start
L5: async run_workers(n) → gather + lock
L6: async dispatch(ids, max) → semaphore
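The backoff formula on the L3 line, worked through as a small sketch. `next_retry_time` and `due_retries` are hypothetical helpers; counting `attempt` from 0 and treating a retry as due when `ts >= retry_time` are both assumptions to be checked against the actual spec.

```python
def next_retry_time(ts, base_ms, attempt):
    """retry_time = ts + base * 2^attempt (attempt counted from 0 here)."""
    return ts + base_ms * (2 ** attempt)


def due_retries(tasks, ts):
    """Ids of tasks whose scheduled retry time has arrived (lazy check, no timers)."""
    return sorted(
        tid for tid, task in tasks.items()
        if task["status"] == "RETRY_SCHEDULED" and ts >= task["retry_time"]
    )


# Failing at ts=1000 with base=100 on attempts 0..3:
schedule = [next_retry_time(1000, 100, attempt) for attempt in range(4)]
```

The delay doubles each time (100, 200, 400, 800 ms), so `schedule` holds 1100, 1200, 1400 and 1800.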
L1: add_item(id, qty, price) / get_item / remove_item
self.items = {} # id → {"qty":0,"price":0}
L2: search(prefix) / top_items by value(qty*price)
L3: warehouses with capacity limit
assign_item / transfer_item
L4: price_history / snapshot + restore(deepcopy)
L5: async bulk_import → semaphore + lock per item
L6: async process_orders → gather + lock per warehouse
L1: add_package / install / uninstall / get_status
self.packages = {} # id → {"installed":False,"deps":[]}
L2: search(prefix) / list_installed sorted
L3: dependencies — install A auto install B,C
check circular deps
L4: version history / rollback to previous state
L5: async batch_install → gather + lock
L6: async download_packages → semaphore
L1: create_topic / publish / subscribe / consume
self.topics = {} # id → {"messages":[],"subs":set()}
L2: list_topics / get_message_count sorted
L3: TTL — messages expire after N ms
L4: consumer offset tracking / replay from offset
L5: async batch_publish → gather + lock per topic
L6: async fan_out → semaphore
L1: add_server / remove_server / route_request(round-robin)
self.servers = {} # id → {"weight":1,"load":0}
L2: get_server_load sorted / find_least_loaded
L3: health check — timeout→unhealthy, auto recover
L4: sticky session history / failover
L5: async batch_route → gather + lock
L6: async health_check_all → semaphore
L1: add_player / update_score / get_rank
self.players = {} # id → {"score":0}
L2: top_players(n) sorted / search_by_prefix
L3: score decay — 每 N ms 扣分(lazy processing)
L4: season history / rollback to previous season
L5: async batch_update_scores → gather + lock
L6: async sync_external → semaphore
L1: create_alert / acknowledge / get_status
self.alerts = {} # id → {"severity":"warning","acked":False}
L2: list_by_severity sorted / filter_unacknowledged
L3: auto-escalate — N ms 冇 ack→升級 severity
L4: alert history / merge duplicate alerts
L5: async batch_create → gather + lock
L6: async send_notifications → semaphore
L1: create_session(user)→token / get_session / revoke
self.sessions = {} # token → {"user":"","expiry":ts}
L2: list_active sorted / get_sessions_by_user
L3: TTL — session 過期, refresh 續期
L4: session history per user / limit max concurrent
L5: async batch_create → gather + lock per user
L6: async cleanup_expired → semaphore
L1: create_workflow / add_step / start / complete
self.workflows = {} # id → {"steps":[{"status":"pending"}]}
L2: get_progress sorted / list_by_status
L3: auto-trigger — 完成一步自動觸發下一步
L4: rollback — 失敗後回退已完成嘅 steps
L5: async run_steps → gather + lock
L6: async external_trigger → semaphore
L1: add_record(domain, ip) / resolve(domain) / delete
self.records = {} # domain → {"ip":"","ttl":0}
L2: list_records sorted / search wildcard *.example.com
L3: TTL — records 過期
L4: CNAME chains / backup + restore
L5: async batch_resolve → gather + lock
L6: async bulk_update → semaphore
L1: add_user / grant(user, resource, perm) / check_access
self.users = {} # user → set of roles
self.perms = {} # (user, resource) → set of perms
L2: list_users_with_access / get_permissions sorted
L3: time-limited access — grant expires after N ms
L4: role inheritance / audit history
L5: async batch_check → gather + lock
L6: async sync_permissions → semaphore
L1: add_log(ts, level, source, msg) / get_logs(source)
self.logs = [] # [(ts, level, source, msg)]
L2: filter_by_level / count_by_source sorted
L3: retention — 超過 N ms 嘅 logs 自動刪
L4: snapshot + restore / aggregate stats
L5: async batch_ingest → gather + lock per source
L6: async export_logs → semaphore
A parking lot: add a spot, park a vehicle, remove a vehicle, and expire vehicles that overstay. Nearly the same pattern as Hotel, except that expiry is duration-based and automatic.
Core mapping between Parking and Hotel:
Hotel room → Parking spot
Hotel guest → Parking vehicle
Hotel book → Parking park_vehicle
Hotel checkout → Parking remove_vehicle
Hotel late_fee → Parking fee (charged by parking duration)
Hotel upgrade → (none; Parking uses a capacity limit instead)
Parking-only details:
vehicle_id == None → available (empty spot)
vehicle_id != None → occupied (a vehicle is parked)
expires_at != None → time-limited; the vehicle is evicted at expiry
spot_type = "compact" / "regular" / "large"
max_spots per type → capacity limit
Parking is a reskinned Hotel. The core pattern is the same: flat dict + lazy expiry + capacity + batch + sync.
Hotel                     Parking
───────────────────────   ───────────────────────
add_room                  add_spot
book_room                 park_vehicle
checkout                  remove_vehicle
get_room_info             get_spot
late_fee                  fee (charged by parking duration)
upgrade_room              (none; replaced by the capacity limit)
batch_operations          batch_operations (same)
send_notifications        sync_lots (transfer + fail-fast)
── L1 CRUD ──
🟰 add_spot: same as Hotel add_room
🟰 park_vehicle: similar to Hotel book_room (returns True/False)
🟰 remove_vehicle: similar to Hotel checkout (returns True/False)
🟰 get_spot: same as Hotel get_room_info
── L2 Sort ──
🟰 list_spots: same as Hotel list_rooms (sort_by "id"/"type")
🟰 count_available: count the empty spots
⚠️ find_spot: find the first empty spot of a given type
── L3 TTL ──
⚠️ park_vehicle_with_duration: park + automatic expiry (Hotel has no equivalent)
🟰 _process_expired: same lazy pattern as Hotel _process_late_fee
⚠️ get_fee: fee based on parking duration
── L4 Capacity + History ──
⚠️ set_max_spots: cap how many vehicles of each type may be parked
🟰 get_history: same as Hotel get_booking_history
── L5 Batch ──
🟰 batch_operations: same as Hotel L5 (lock per spot_id)
── L6 Sync ──
⚠️ sync_lots: transfer a vehicle out + fail-fast + semaphore + sleep
import asyncio
from collections import defaultdict
class ParkingLotSystem:
    def __init__(self):
        self.spots = {}                                # L1: spot_id → spot info dict
        self.max_spots = {}                            # added in L4: spot_type → max occupied count
        self.spots_locks = defaultdict(asyncio.Lock)   # added in L5: one async lock per spot_id
def __init__(self):
self.spots = {} # 主角:spot_id → spot info
self.max_spots = {} # L4 加:type → 最多同時泊幾多架
self.spots_locks = defaultdict(asyncio.Lock) # L5 加:per-spot lock
# 你要記住:
# Parking 唔係好多 dict
# 由頭到尾其實得 1 個主 dict + 2 個 addon state
self.spots = {
"s1": {
"type": "compact", # 呢個位係咩車位
"vehicle_id": "V100", # None = 空位;有值 = 有車泊緊
"parked_at": 1000, # 幾時泊入嚟
"expires_at": 6000, # L3:幾時到鐘要趕走
"history": [ # L4:呢個位發生過咩事
{"event": "park", "vehicle_id": "V100", "ts": 1000, "expires_at": 6000}
]
},
"s2": {
"type": "large",
"vehicle_id": None,
"parked_at": None,
"expires_at": None,
"history": []
}
}
self.max_spots = {"compact": 2, "large": 1}
# self.spots_locks["s1"] 係 Lock object,唔會手寫落 data structure 圖入面
L1:self.spots = {spot_id: {"type", "vehicle_id", "parked_at"}}
L3:每個 spot 多 "expires_at"
L4:每個 spot 多 "history",另外加 self.max_spots
L5:再加 self.spots_locks
L6:唔加新 field;只係 method 入面開 sem 控制並發
一句記法:
spot 自己狀態擺 self.spots
type 上限擺 self.max_spots
並發控制擺 self.spots_locks
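lazy = no background timer. Every method that needs to see the latest active set refreshes it first. The usual place is the top of each public method, but if the spec defines an explicit cleanup API, that API triggers it instead. The refresh checks each spot for an overstaying vehicle and, if it finds one, evicts it (clears vehicle_id).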
# Helper: _process_expired, lazy expiry (called at the top of every public method)
def _process_expired(self, timestamp):  # sweep every spot and evict overstaying vehicles
    for spot_id, spot in self.spots.items():
        if spot["expires_at"] is not None and spot["vehicle_id"] is not None:  # occupied and time-limited
            if timestamp >= spot["expires_at"]:  # expired? (>=: evicted on the exact expiry tick)
                spot["history"].append({"event": "expired", "vehicle_id": spot["vehicle_id"], "ts": timestamp})  # L4: record the expiry event
                spot["vehicle_id"] = None   # evict: clear the vehicle
                spot["parked_at"] = None    # clear the park time
                spot["expires_at"] = None   # clear the expiry
_process_expired(timestamp)
行一次 self.spots
凡係:
1. vehicle_id is not None(有車)
2. expires_at is not None(有時限)
3. timestamp >= expires_at(到期)
就清走車 + 記 history
每個 public method 第一行都 call 一次(lazy 模式)
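Hotel: late_fee comes due → revenue += fee (collect the money)
Parking: expires_at is reached → clear the vehicle (evict it)
Hotel: fee is stored inside the room dict
Parking: expires_at is stored inside the spot dict
In common: both are lazy (no background timer),
both are called at the top of every public method,
and both settle up at expiry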
def add_spot(self, timestamp, spot_id, spot_type):  # add a new spot
    self._process_expired(timestamp)  # L3: evict overstaying vehicles first
    if spot_id in self.spots:  # spot already exists
        return False
    self.spots[spot_id] = {
        "type": spot_type,    # compact / regular / large
        "vehicle_id": None,   # None = empty
        "parked_at": None,    # None = no vehicle
        "expires_at": None,   # L3: expiry time
        "history": [],        # L4: event log
    }
    return True

def park_vehicle(self, timestamp, spot_id, vehicle_id):  # park a vehicle in a spot
    self._process_expired(timestamp)
    if spot_id not in self.spots: return False  # unknown spot
    spot = self.spots[spot_id]
    if spot["vehicle_id"] is not None: return False  # already occupied
    # L4: capacity check
    spot_type = spot["type"]
    if spot_type in self.max_spots:  # a limit is set for this type
        occupied_count = sum(1 for s in self.spots.values() if s["type"] == spot_type and s["vehicle_id"] is not None)  # occupied spots of this type
        if occupied_count >= self.max_spots[spot_type]: return False  # full → reject
    spot["vehicle_id"] = vehicle_id
    spot["parked_at"] = timestamp
    spot["history"].append({"event": "park", "vehicle_id": vehicle_id, "ts": timestamp})  # L4: record the event
    return True

def remove_vehicle(self, timestamp, spot_id):  # manual departure
    self._process_expired(timestamp)
    if spot_id not in self.spots: return False  # unknown spot
    spot = self.spots[spot_id]
    if spot["vehicle_id"] is None: return False  # nothing parked, nothing to remove
    spot["history"].append({"event": "remove", "vehicle_id": spot["vehicle_id"], "ts": timestamp})  # L4: record the event
    spot["vehicle_id"] = None
    spot["parked_at"] = None
    spot["expires_at"] = None
    return True

def get_spot(self, timestamp, spot_id):  # look up one spot
    self._process_expired(timestamp)
    if spot_id not in self.spots: return None  # unknown spot → None
    return dict(self.spots[spot_id])  # return a copy so callers cannot edit internal state
def __init__(self):
self.spots = {}
# L1 只需要 1 個主 dict
# 未有 self.max_spots
# 未有 self.spots_locks
self.spots = {
"s1": {
"type": "compact",
"vehicle_id": "V100",
"parked_at": 1
},
"s2": {
"type": "large",
"vehicle_id": None,
"parked_at": None
}
}
# The core is just:
# spot_id → what type the spot is, whether a vehicle is in it, and when it parked
"s1": {
    "type": "compact",
    "vehicle_id": "V100",   ← parked
    "parked_at": 1,
    "expires_at": None,     ← no time limit
    "history": [{"event":"park","vehicle_id":"V100","ts":1}]
}
"s1": {
    "vehicle_id": None,     ← vehicle left
    "parked_at": None,
    "expires_at": None,
    "history": [..., {"event":"remove","vehicle_id":"V100","ts":5}]
}
return True
# park_vehicle returns True/False
# it does not return a booking ID
return True    ← success
return False   ← unknown spot / already occupied / type is full
# remove_vehicle returns True/False
# it does not return the vehicle_id
return True    ← vehicle removed
return False   ← unknown spot / no vehicle parked
# get_spot returns a dict copy or None
# never return self.spots[id] directly (callers could modify it)
return dict(spot)   ← copy
return None         ← unknown spot
def list_spots(self, timestamp, sort_by):  # list every spot, sorted by "id" or "type"
    self._process_expired(timestamp)
    if sort_by == "id":
        sorted_items = sorted(self.spots.items(), key=lambda x: x[0])  # spot_id asc
    elif sort_by == "type":
        sorted_items = sorted(self.spots.items(), key=lambda x: (x[1]["type"], x[0]))  # type asc, ties by id asc
    else:  # unknown sort_by
        sorted_items = list(self.spots.items())  # unsorted, insertion order
    return [{"spot_id": sid, **info} for sid, info in sorted_items]  # add spot_id to each entry

def count_available(self, timestamp):  # how many spots are empty
    self._process_expired(timestamp)
    return sum(1 for spot in self.spots.values() if spot["vehicle_id"] is None)  # vehicle_id None = empty

def find_spot(self, timestamp, spot_type):  # first empty spot of the given type
    self._process_expired(timestamp)
    for spot_id, spot in sorted(self.spots.items()):  # walk in spot_id order
        if spot["type"] == spot_type and spot["vehicle_id"] is None:  # right type and empty
            return spot_id  # first match wins
    return None  # nothing found
spots = {
"s1": {"type": "large", "vehicle_id": None},
"s2": {"type": "compact", "vehicle_id": "V1"},
"s3": {"type": "compact", "vehicle_id": None}
}
list_spots(1, "type")
→ [
{"spot_id":"s2", "type":"compact", ...},
{"spot_id":"s3", "type":"compact", ...},
{"spot_id":"s1", "type":"large", ...}
]
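# compact sorts first (c < l); same type is ordered by id
find_spot(1, "compact")
→ "s3"
# s2 is occupied, skip
# s3 is empty and the type matches → return it
# find_spot returns a spot_id (string) or None
# it does not return the spot dict
return "s3"   ← spot_id
return None   ← nothing found
# count_available skips occupied spots
# only vehicle_id is None counts
# do not use len(self.spots) (that is the total)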
def park_vehicle_with_duration(self, timestamp, spot_id, vehicle_id, duration_ms):  # park with a time limit
    self._process_expired(timestamp)
    if spot_id not in self.spots: return False  # unknown spot
    spot = self.spots[spot_id]
    if spot["vehicle_id"] is not None: return False  # already occupied
    # L4: capacity check (same as park_vehicle)
    spot_type = spot["type"]
    if spot_type in self.max_spots:  # a limit is set for this type
        occupied_count = sum(1 for s in self.spots.values() if s["type"] == spot_type and s["vehicle_id"] is not None)  # occupied spots of this type
        if occupied_count >= self.max_spots[spot_type]: return False  # full → reject
    spot["vehicle_id"] = vehicle_id
    spot["parked_at"] = timestamp
    spot["expires_at"] = timestamp + duration_ms  # expiry = now + time limit
    spot["history"].append({"event": "park", "vehicle_id": vehicle_id, "ts": timestamp, "expires_at": timestamp + duration_ms})  # L4
    return True

def get_fee(self, timestamp, spot_id):  # current parking fee for this spot
    self._process_expired(timestamp)
    if spot_id not in self.spots: return None  # unknown spot
    spot = self.spots[spot_id]
    if spot["vehicle_id"] is None: return 0  # nothing parked → fee = 0
    duration = timestamp - spot["parked_at"]  # how long it has been parked (ms)
    fee = duration  # simple version: 1 ms = 1 unit of fee
    return fee
"s1": {
"vehicle_id": "V1",
"parked_at": 100,
"expires_at": 5100 ← 100 + 5000
}
# once timestamp >= 5100,
# _process_expired clears V1 automatically
duration = 300 - 100 = 200
fee = 200
return 200
# fee is based on how long the vehicle has actually been parked,
# not on duration_ms (that is the time limit)
# if the vehicle leaves before expiry, fee = actual time parked
# expires_at = timestamp + duration_ms
# not timestamp + fee
# duration_ms is the maximum stay
# fee is the actual stay
# _process_expired uses >=, not >
if timestamp >= spot["expires_at"]: ✅
if timestamp > spot["expires_at"]: ❌
# get_fee returns 0 once the vehicle has left
# do not return the earlier fee
if spot["vehicle_id"] is None: return 0
def set_max_spots(self, timestamp, spot_type, max_count):  # cap how many vehicles of a type may be parked
    self._process_expired(timestamp)
    self.max_spots[spot_type] = max_count  # at most max_count vehicles for this type
    return True

def get_history(self, timestamp, spot_id):  # event log for one spot
    self._process_expired(timestamp)
    if spot_id not in self.spots: return None  # unknown spot
    return list(self.spots[spot_id]["history"])  # return a copy so callers cannot edit it
def __init__(self):
self.spots = {}
self.max_spots = {}
# 到 L4 先正式有第 2 份 state:
# self.max_spots[type] = max_count
self.spots = {
"s1": {
"type": "compact",
"vehicle_id": "V2",
"parked_at": 7,
"expires_at": 12,
"history": [
{"event":"park", "vehicle_id":"V1", "ts":1},
{"event":"remove", "vehicle_id":"V1", "ts":5},
{"event":"park", "vehicle_id":"V2", "ts":7, "expires_at":12}
]
}
}
self.max_spots = {
"compact": 2,
"large": 1
}
# 即係:
# self.spots 管每個位自身狀態
# self.max_spots 管每種 type 全場最多容納幾多架
self.max_spots = {"compact": 2}
# 之後 park_vehicle / park_vehicle_with_duration
# 會 check:呢個 type 已泊幾多?
# 如果 occupied >= 2 → return False(滿咗)
# 例子:
# s1(compact) 有車, s2(compact) 有車
# park_vehicle(t, "s3", "V3") → s3 係 compact
# occupied = 2 >= max 2 → return False
[
{"event":"park", "vehicle_id":"V1", "ts":1},
{"event":"remove", "vehicle_id":"V1", "ts":5},
{"event":"park", "vehicle_id":"V2", "ts":7, "expires_at":12},
{"event":"expired", "vehicle_id":"V2", "ts":12}
]
# 每次 park / remove / expired 都記一筆
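The capacity check counts occupied spots (the ones with a vehicle),
not the total number of spots of that type.
With 3 compact spots and max=2,
parking is allowed as long as occupied < 2.
set_max_spots never evicts vehicles that are already parked.
It only affects later park operations;
vehicles over the new limit stay where they are.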
Handle park / remove concurrently. Lock per spot_id. Each op locks exactly one spot.
# Parking L5 changes only one spot per op (unlike Hotel, where upgrade locks two)
# so each op needs the lock for a single spot_id
async def batch_operations(self, timestamp, ops):  # process a batch of park/remove operations
    self._process_expired(timestamp)  # evict overstaying vehicles before starting

    # step 1: how to process a single operation
    async def execute_op(op):
        sid = op["spot_id"]  # the spot this op touches
        async with self.spots_locks[sid]:  # only one op at a time per spot
            if op["type"] == "park":
                return self.park_vehicle(timestamp, op["spot_id"], op["vehicle_id"])  # reuse the sync method
            elif op["type"] == "remove":
                return self.remove_vehicle(timestamp, op["spot_id"])
            elif op["type"] == "park_with_duration":
                return self.park_vehicle_with_duration(timestamp, op["spot_id"], op["vehicle_id"], op["duration_ms"])

    # step 2: collect every op and gather them together
    tasks = []
    for op in ops:
        tasks.append(execute_op(op))  # one coroutine per op
    results = await asyncio.gather(*tasks)  # different spots run in parallel; same spot queues on its lock
    return list(results)  # results in input order
def __init__(self):
self.spots = {}
self.max_spots = {}
self.spots_locks = defaultdict(asyncio.Lock)
# spots / max_spots 係資料
# spots_locks 唔係資料,係保護資料用
資料本身仲係:
self.spots = {
"s1": {"type":"compact", "vehicle_id":"V1", "parked_at":1, "expires_at":None, "history":[]},
"s2": {"type":"large", "vehicle_id":None, "parked_at":None, "expires_at":None, "history":[]}
}
self.max_spots = {"compact": 2}
另外 runtime 會有:
self.spots_locks["s1"] = <Lock>
self.spots_locks["s2"] = <Lock>
# 右手要分清:
# data structure 係 spots / max_spots
# concurrency state 係 spots_locks
1. lock per SPOT_ID
2. async with lock: 入面 call sync method
3. asyncio.gather(*[...]) 同時跑
4. return list(results)
Parking never needs to lock two keys,
because it has no upgrade/transfer op between spots.
Each op touches a single spot.
ops = [
{"type":"park", "spot_id":"s1", "vehicle_id":"V1"},
{"type":"park", "spot_id":"s2", "vehicle_id":"V2"},
{"type":"remove", "spot_id":"s3"}
]
# s1, s2, s3 唔同 → 三個並行
# 如果兩個都係 s1 → 排隊(lock)
→ [True, True, True]
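The "same spot queues, different spots run in parallel" behaviour can be seen in a stripped-down sketch. `run_batch` and its `occupied` dict are hypothetical stand-ins for the class above; the `asyncio.sleep(0)` is only there to create a yield point inside the lock.

```python
import asyncio
from collections import defaultdict


async def run_batch(ops):
    locks = defaultdict(asyncio.Lock)  # one lock per spot_id, created on demand
    occupied = {}                      # spot_id -> vehicle_id

    async def execute(op):
        async with locks[op["spot_id"]]:  # ops on the same spot queue here
            await asyncio.sleep(0)        # yield point while holding the lock
            if op["type"] == "park":
                if op["spot_id"] in occupied:
                    return False
                occupied[op["spot_id"]] = op["vehicle_id"]
                return True
            if op["type"] == "remove":
                return occupied.pop(op["spot_id"], None) is not None
            return False

    results = await asyncio.gather(*(execute(op) for op in ops))
    return list(results), occupied
```

Three ops on `s1` (park V1, park V2, remove) are applied one after another in submission order, while an op on `s2` does not wait for them.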
Transfer parked vehicles out to another lot. Fail-fast: validate before the semaphore, and a failed transfer never sleeps. Only valid transfers enter the semaphore and sleep.
async def sync_lots(self, timestamp, transfers, max_concurrent):  # concurrent cross-lot transfers; invalid ones are rejected before rate limiting
    self._process_expired(timestamp)  # evict overstaying vehicles first
    sem = asyncio.Semaphore(max_concurrent)  # at most N transfers in flight

    # step 1: how a single transfer is validated and executed
    async def do_transfer(t):
        sid = t["spot_id"]
        lock = self.spots_locks[sid]  # this spot's own lock
        # step 2: fail-fast validation under the lock
        async with lock:
            if sid not in self.spots:  # unknown spot
                return False  # leave now, no sleep
            if self.spots[sid]["vehicle_id"] is None:  # nothing to transfer
                return False  # leave now, no sleep
            if self.spots[sid]["vehicle_id"] != t["vehicle_id"]:  # vehicle id does not match
                return False  # leave now, no sleep
        # step 3: only validated transfers take a semaphore slot and simulate the transfer
        async with sem:
            await asyncio.sleep(0.01)  # simulated cross-lot transfer latency
        # step 4: transfer done, clear the vehicle from the original spot
        async with lock:  # lock again to change state
            spot = self.spots[sid]
            if spot["vehicle_id"] != t["vehicle_id"]:  # re-check: the spot may have changed while sleeping (e.g. a duplicate transfer finished first)
                return False
            spot["history"].append({"event": "transfer", "vehicle_id": spot["vehicle_id"], "ts": timestamp})  # record the transfer
            spot["vehicle_id"] = None   # the vehicle has moved to another lot
            spot["parked_at"] = None
            spot["expires_at"] = None
            return True

    # step 5: collect every transfer and gather them together
    tasks = []
    for t in transfers:
        tasks.append(do_transfer(t))  # one coroutine per transfer
    results = await asyncio.gather(*tasks)  # wait for all transfers
    return list(results)  # True/False in input order
def __init__(self):
self.spots = {}
self.max_spots = {}
self.spots_locks = defaultdict(asyncio.Lock)
# L6 冇 self.sem 呢啲 state
# sem 係 method 入面臨時開:
# sem = asyncio.Semaphore(max_concurrent)
self.spots
= 真資料(有咩位、有冇車、幾時到期、history)
self.max_spots
= business rule(每種 type 最多幾多架)
self.spots_locks
= per-spot lock,避免同一個位俾兩個 coroutine 一齊改
sem
= sync_lots() 入面臨時開嘅閘口
控制「同時幾多單 transfer 真正去 sleep」
A failed transfer never acquires sem; it returns False immediately.
A valid one takes sem → sleeps → changes state → returns True.
3 failure conditions (any one = False):
1. the spot does not exist
2. the spot is empty (vehicle_id is None)
3. vehicle_id does not match
Flow:
lock → check → fail? return False (no sleep)
pass? → sem → sleep → lock → change state → True
transfers = [
{"spot_id":"s1","vehicle_id":"V1"}, ← s1 有 V1 → pass
{"spot_id":"s9","vehicle_id":"V9"}, ← s9 唔存在 → fail-fast
{"spot_id":"s1","vehicle_id":"V2"}, ← s1 車係 V1 唔係 V2 → fail-fast
]
max_concurrent = 1
→ [True, False, False]
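# s9 and V2 return False immediately (0 s)
# only V1 actually sleeps (0.01 s)
# after the transfer, s1 is an empty spot
Parking L6 = fail-fast:
check happens before sem
failures do not sleep → they never hold a sem slot
NF L6 = all-sleep:
everything enters sem → sleep → then check
failures sleep too → they hold a sem slot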
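The difference between the two variants comes down to where the validity check sits relative to the semaphore. A small sketch that counts how many requests actually sleep; `run_requests` and the `fail_fast` flag are hypothetical, introduced only to put both behaviours side by side.

```python
import asyncio


async def run_requests(valid_flags, max_concurrent, fail_fast):
    sem = asyncio.Semaphore(max_concurrent)
    sleeps = 0  # how many requests took a semaphore slot and slept

    async def do_one(is_valid):
        nonlocal sleeps
        if fail_fast and not is_valid:
            return False               # rejected before touching the semaphore
        async with sem:
            sleeps += 1
            await asyncio.sleep(0.001) # simulated external call
            return is_valid            # all-sleep: validity is decided after the wait

    results = await asyncio.gather(*(do_one(v) for v in valid_flags))
    return list(results), sleeps
```

Both variants return the same list of booleans; only the number of sleeps (and therefore the total running time under a small `max_concurrent`) differs.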
L6 考 async + Semaphore。睇到 spec 先判斷係邊種 pattern,再 copy skeleton。
async def sync_replicas(self, timestamp, requests, max_concurrent):  # concurrent sync to other servers (fail-fast)
    sem = asyncio.Semaphore(max_concurrent)  # N permits

    async def do_one(req):  # handle a single request
        source = req["source"]
        dest = req["destination"]
        if source not in self.servers:  # check 1: does source exist?
            return False  # no → leave now, no sem, no sleep
        if dest not in self.servers:  # check 2: does dest exist?
            return False  # no → leave now
        async with sem:  # both checks passed → take a permit
            await asyncio.sleep(0.01)  # simulated network latency
            return True

    tasks = []
    for req in requests:
        tasks.append(do_one(req))  # one coroutine per request
    results = await asyncio.gather(*tasks)  # run them all and wait
    return list(results)  # plain list, in input order
Check 喺 sem 外面做。唔合格嘅唔入 sem 唔 sleep,即走 return False
合格先 async with sem → sleep → return True
Return list[bool]
用喺:Hashring, ChatRoute, DNS, Session, Notification
async def dispatch_tasks(self, timestamp, task_ids, external_call, max_concurrent):  # batch dispatch (semaphore + fail-fast)
    sem = asyncio.Semaphore(max_concurrent)  # N permits
    results = [False] * len(task_ids)  # pre-fill with False

    async def worker(index, tid):  # one coroutine per task id (not a loop that keeps pulling tasks from a queue)
        if tid not in self.tasks:  # fail-fast: unknown task
            return
        if self.tasks[tid]["status"] != "COMPLETED":  # fail-fast: not COMPLETED
            return
        async with sem:  # only valid tasks take a permit
            await external_call(tid)  # call the external service
            self.tasks[tid]["status"] = "DISPATCHED"  # record the new status
            results[index] = True  # mark this slot as dispatched

    tasks = []
    for i, tid in enumerate(task_ids):
        tasks.append(worker(i, tid))  # one coroutine per (index, tid)
    await asyncio.gather(*tasks)  # run them all
    return results
預先填 results = [False] * N,成功先改 True
Fail-fast check 喺 sem 外面(同 Pattern 1 一樣)
sem 入面做真正嘅工作(external_call + 改 status)
Return list[bool]
用喺:TaskQueue
async def execute_steps(self, workflow_id, step_ids, external_call, max_concurrent):  # run the step lifecycle concurrently (sem + lock)
    semaphore = asyncio.Semaphore(max_concurrent)  # N permits

    async def run_one(step_id):  # handle a single step
        async with semaphore:  # take a permit first
            key = (workflow_id, step_id)
            if key not in self.step_status:  # unknown step
                return "skipped:" + step_id
            if self.step_status[key] != "READY":  # not ready to run
                return "skipped:" + step_id
            try:
                lock = self.locks[workflow_id]
                async with lock:  # short critical section: start
                    start_result = self.start_step(workflow_id, step_id)
                    if start_result != "started":  # start refused
                        return "skipped:" + step_id
                await external_call(workflow_id, step_id)  # outside the lock (may be slow)
                async with lock:  # short critical section: complete
                    self.complete_step(workflow_id, step_id)  # mark done so dependent steps can proceed
                return "executed:" + step_id  # success: report the step as executed
            except Exception as e:  # something raised
                return "error:" + step_id + ":" + str(e)

    tasks = []
    for step_id in step_ids:
        tasks.append(run_one(step_id))  # one coroutine per step
    results = await asyncio.gather(*tasks)  # run them all and wait
    return list(results)  # plain list, in input order
每個 item 有 3 步:lock(start) → external_call(出鎖) → lock(complete)
點解出鎖做 external_call:可能慢,鎖住嘅話其他 step 要等
try/except 包住 external_call:爆咗 return "error:id:msg",唔影響其他
Return list[str]:3 種值 "executed:id" / "skipped:id" / "error:id:msg"
用喺:Workflow, PackageMgr
Pattern 1 Fail-Fast: check → sem → sleep → True/False
Pattern 2 Worker Pool: check → sem → external_call + status update → True/False
Pattern 3 Lifecycle: sem → check → lock(start) → external_call → lock(complete) → one of 3 strings
共同 skeleton(每個都有):
sem = asyncio.Semaphore(max_concurrent)
async def do_one(...):
...
tasks = []
for item in items:
tasks.append(do_one(item))
results = await asyncio.gather(*tasks)
return list(results)
想像你寫一個 Gym 會員 check-in system mock。每個 member 有 member_id 同 name。要寫個 class 模擬註冊、入場、離場、排序、自動離場、容量限制、async batch。
想像一間 gym:
┌──────────────────────────────────────────┐
│ member_id="m1" name="Alice" checked_in │
│ member_id="m2" name="Bob" checked_out │
│ member_id="m3" name="Carol" checked_in │
│ expires_at=7000 │
└──────────────────────────────────────────┘
每個 member 有:
member_id = 會員編號(unique key)
name = 會員名
checked_in = 而家喺唔喺 gym 入面
checkins = 總共入場幾多次
expires_at = 幾時自動 check out(None = 唔會自動走)
規則:
1. member_id 唔可以重複(register 之前要 check)
2. check_in 要先 register 過(未註冊 → False)
3. 入場時限到鐘 → 自動 check out(lazy purge,唔係刪 member)
4. gym 有 capacity limit — 滿咗就拒絕入場
# 例:上面間 gym 查一啲嘢
get_member_status(t, "m1") → "checked_in"
get_member_status(t, "m2") → "checked_out"
get_member_status(t, "m99") → "not_found"
get_active_count(t) → 2(Alice + Carol)
list_members(t, "name") → "Alice(1), Bob(0), Carol(1)"
list_members(t, "checkins") → "Alice(1), Carol(1), Bob(0)"
# 後面 level 加多啲嘢:
# L2 加 sort(list_members)+ count(get_active_count)
# L3 加 TTL(check_in_with_duration, lazy _process_expired)
# L4 加 capacity limit(set_capacity, get_capacity)
# L5 加 async batch_operations(per-member lock)
# L6 加 sync_members(rate-limited,semaphore)
import asyncio
from collections import defaultdict
class GymSystem:
def __init__(self):
self.members = {} # L1 所有會員(member_id → info dict)
self.capacity = -1 # L4 加:gym 最大人數(-1 = 無限)
self.locks = defaultdict(asyncio.Lock) # L5 加:per-member 嘅 async lock
self.members = {
"m1": {"name": "Alice", "checked_in": True, "checkins": 1, "expires_at": None},
"m2": {"name": "Bob", "checked_in": False, "checkins": 0, "expires_at": None},
}
# 第一層 key = member_id("m1")
# 第二層係個 dict,存呢個 member 嘅 info
member_id │ name │ checked_in │ checkins │ expires_at
────────────┼────────┼────────────┼──────────┼────────────
m1 │ Alice │ True │ 1 │ None
m2 │ Bob │ False │ 0 │ None
m3 │ Carol │ True │ 1 │ 7000
L1:name, checked_in, checkins # 最基本
L2:(冇加新 field,只係讀 checkins + name)
L3:expires_at # None = 唔會自動走;int = 幾時自動 checkout
L4:self.capacity # init 時加(-1 = 無限)
L5:self.locks # init 時加 defaultdict(asyncio.Lock)
L6:(冇加新 field,semaphore 喺 method 入面開)
# Helper: _process_expired — lazy TTL 到鐘就幫會員自動離場(每個 public method 開頭都 call)
def _process_expired(self, timestamp): # 唔係定時 task,係 lazy 模式
for mid, info in self.members.items(): # 逐個 member 睇
exp = info["expires_at"] # 攞 expires_at(可能係 None)
if exp is None: # None = 唔會自動走
continue # 跳過唔睇
if timestamp >= exp: # 到鐘就當今次入場時間用完;要幫會員自動離場
info["checked_in"] = False # 自動 check out(唔係 del member,只係離場)
info["expires_at"] = None # 清走 TTL(已經處理完)
_process_expired(timestamp)
行一次 self.members
凡係 expires_at 不為 None 且 timestamp >= expires_at
就將佢 checked_in 改做 False + expires_at 改 None
注意:唔係 del member!只係自動離場
每個 public method 第一行都 call 一次(lazy 模式)
FS/Bank 嘅 TTL 過期 → del 走個 entry(file/account 消失)
Gym 嘅 TTL 過期 → 只係 check out(member 仲喺 system 入面)
所以 Gym 唔需要 expired = [] 先收集再 del
直接改 field 就得(唔會 modify dict size,safe to iterate)
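The two TTL styles contrasted above can be sketched side by side. `purge_expired_entries` and `auto_checkout` are hypothetical helper names, and both assume each entry is a dict with an `expires_at` field.

```python
def purge_expired_entries(store, timestamp):
    """Delete-style TTL: collect keys first, then delete after the loop."""
    expired = [
        key for key, info in store.items()
        if info["expires_at"] is not None and timestamp >= info["expires_at"]
    ]
    for key in expired:  # deleting inside the loop above would raise RuntimeError
        del store[key]
    return expired


def auto_checkout(members, timestamp):
    """Gym-style TTL: only fields change, so editing during iteration is safe."""
    for info in members.values():
        exp = info["expires_at"]
        if exp is not None and timestamp >= exp:
            info["checked_in"] = False
            info["expires_at"] = None
```

The first helper changes the size of the dict, so it needs the two-pass shape; the second only rewrites values, so a single pass is enough.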
register = register a new member
check_in = enter the gym
check_out = leave the gym
get_member_status = look up a member's status
def register_member(self, timestamp, member_id, name): # 註冊新會員
self._process_expired(timestamp) # 開頭先清過期(公定模式)
if member_id in self.members: # 重複 member_id → 拒收
return False # 約定 return False
self.members[member_id] = { # 開一格新 member
"name": name, # 記低名
"checked_in": False, # 初始狀態:未入場
"checkins": 0, # 入場次數由 0 開始
"expires_at": None, # 冇 TTL = None(L3 嗰個 method 先會 set 數字)
}
return True # 註冊成功
def check_in(self, timestamp, member_id): # 會員入場
self._process_expired(timestamp) # 開頭先清過期
if member_id not in self.members: # 未註冊
return False # 唔畀入
info = self.members[member_id] # 攞呢個 member 嘅 info
if info["checked_in"]: # 已經喺入面
return False # 唔好重複 check in
info["checked_in"] = True # 入場
info["checkins"] += 1 # 入場次數 +1
return True # 入場成功
def check_out(self, timestamp, member_id): # 會員離場
self._process_expired(timestamp) # 開頭先清過期
if member_id not in self.members: # 未註冊
return False # 查無此人
info = self.members[member_id] # 攞 info
if not info["checked_in"]: # 本身唔喺入面
return False # 冇得 check out
info["checked_in"] = False # 離場
info["expires_at"] = None # 清走 TTL(手動離場取消自動離場)
return True # 離場成功
def get_member_status(self, timestamp, member_id): # 查會員狀態
self._process_expired(timestamp) # 開頭先清過期(可能自動 checkout 咗)
if member_id not in self.members: # 唔存在
return "not_found" # 查無此人
info = self.members[member_id] # 攞 info
if info["checked_in"]: # 喺入面
return "checked_in" # 而家喺 gym
return "checked_out" # 唔喺入面(但係有註冊)
def __init__(self):
self.members = {}
self.members = {
"m1": {
"name": "Alice",
"checked_in": False,
"checkins": 0,
"expires_at": None,
},
}
# expires_at 預設一律 None(L3 嗰個 check_in_with_duration 先會 set 數字)
# checked_in 係 bool(True = 喺 gym,False = 唔喺)
# checkins 係 int(每次 check_in 加 1,check_out 唔減)
_process_expired(timestamp)
L1 入面所有 method 第一行都 call
L1 自己唔會產生 expired session(check_in 一律 expires_at=None)
但係要養成習慣,方便 L3 一加 TTL 就有效
list_members = 列晒所有會員 sort_by = "name" 或 "checkins" get_active_count = 而家幾多人喺入面
def list_members(self, timestamp, sort_by): # 列晒所有會員,按 name 或 checkins 排
self._process_expired(timestamp) # 開頭先清過期
items = [] # 暫存所有 (name, checkins) tuple
for mid, info in self.members.items(): # 逐個 member 攞出嚟
items.append((info["name"], info["checkins"])) # 砌做 tuple
if sort_by == "checkins": # checkins 模式
items.sort(key=lambda x: (-x[1], x[0])) # checkins desc,tie 用 name asc
else: # 預設 name 模式
items.sort(key=lambda x: x[0]) # 純 name asc(字母順序)
parts = [] # 砌 output 字串
for name, count in items: # 逐個轉做 "name(count)"
parts.append(name + "(" + str(count) + ")") # 砌單個 entry
return ", ".join(parts) # 用 ", " 連埋一齊
def get_active_count(self, timestamp): # 而家幾多人喺 gym 入面
self._process_expired(timestamp) # 開頭先清過期(過期嘅自動 checkout 咗)
count = 0 # 由 0 開始數
for mid, info in self.members.items(): # 逐個 member 睇
if info["checked_in"]: # checked_in 係 True
count += 1 # 計一個
return count # 返總數
def __init__(self):
self.members = {}
# 同 L1 一樣,冇加新 field
self.members = {
"m1": {"name": "Alice", "checked_in": True, "checkins": 3, "expires_at": None},
"m2": {"name": "Bob", "checked_in": False, "checkins": 1, "expires_at": None},
"m3": {"name": "Carol", "checked_in": True, "checkins": 3, "expires_at": None},
}
# items.sort(key=lambda x: x[0])
# x 就係 tuple,x[0] 就係 name 字串
# 按 name 字母升序排
items = [
("Alice", 3), # A 排先
("Bob", 1), # B 排第二
("Carol", 3), # C 排第三
]
→ "Alice(3), Bob(1), Carol(3)"
# items.sort(key=lambda x: (-x[1], x[0]))
# -x[1] = -checkins(大嘅排先)
# x[0] = name(tie-break 用字母升序)
items = [
("Alice", 3), # -3, "Alice" ← 最細(A 排先)
("Carol", 3), # -3, "Carol" ← 同 -3,C 排後
("Bob", 1), # -1 ← checkins 少排尾
]
→ "Alice(3), Carol(3), Bob(1)"
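The two sort keys worked through above can be checked with a short runnable sketch. The `fmt` helper is a hypothetical name used only for this illustration; the data is the same three members.

```python
items = [("Bob", 1), ("Carol", 3), ("Alice", 3)]

# name mode: plain ascending by name
by_name = sorted(items, key=lambda x: x[0])

# checkins mode: negate the count for descending order, name ascending breaks ties
by_checkins = sorted(items, key=lambda x: (-x[1], x[0]))

def fmt(rows):
    # Build the "name(count), name(count)" output string.
    return ", ".join(f"{name}({count})" for name, count in rows)

name_output = fmt(by_name)
checkins_output = fmt(by_checkins)
```

Negating the count only works for numeric fields; a descending sort on a string field would need a different approach, such as two stable sorts in sequence.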
_process_expired(timestamp)
list_members 同 get_active_count 開頭都要 call
過期 session 應該先 auto-checkout,再計 active count
TTL = time to live duration_ms = session 幾耐之後自動 checkout expires_at = 過期嘅絕對 timestamp lazy = 用嗰陣先 check
def check_in_with_duration(self, timestamp, member_id, duration_ms): # 入場 + 設定自動離場時間
self._process_expired(timestamp) # 開頭先清過期(可能 member 啱啱 auto-checkout,可以重新入場)
if member_id not in self.members: # 未註冊
return False # 唔畀入
info = self.members[member_id] # 攞 info
if info["checked_in"]: # 已經喺入面
return False # 唔好重複 check in
info["checked_in"] = True # 入場
info["checkins"] += 1 # 入場次數 +1
info["expires_at"] = timestamp + duration_ms # 設定幾時自動離場
return True # 入場成功(有自動離場 timer)
def __init__(self):
self.members = {}
# 仲係冇加 instance var,TTL 資訊放入 member dict 入面
self.members = {
"m1": {"name": "Alice", "checked_in": True, "checkins": 1, "expires_at": None},
"m3": {"name": "Carol", "checked_in": True, "checkins": 1, "expires_at": 7000},
}
# expires_at 兩種值:
# None → 唔會自動走(check_in 普通版)
# int (ms) → timestamp >= 呢個值就自動 checkout(check_in_with_duration)
新加:
check_in_with_duration(timestamp, member_id, duration_ms)
無改任何 L1/L2 method(佢哋已經 call _process_expired 喺第一行)
即係 L3 主要係靠 helper + 一個新 method 完成
同 FS 嘅 add_file_with_ttl 一樣嘅 pattern:原有 method 唔改,加一個帶 TTL 嘅新 method,靠 helper 做 lazy cleanup。
_process_expired(timestamp)
L3 真正用得着佢,凡 expires_at 不為 None 且 timestamp 到位
就 auto-checkout(checked_in = False, expires_at = None)
capacity = gym 最大人數 -1 = 無限 滿咗就拒絕入場(return False)
def set_capacity(self, timestamp, max_members): # 設定 gym 最大容量
self._process_expired(timestamp) # 開頭先清過期
self.capacity = max_members # 直接 set(-1 = 無限,正整數 = 上限)
return True # 設定成功
def get_capacity(self, timestamp): # 查 gym 容量
self._process_expired(timestamp) # 開頭先清過期
return self.capacity # 返 -1 或正整數
def __init__(self):
self.members = {}
self.capacity = -1
self.capacity = -1 # 無限
self.capacity = 50 # 最多 50 人
# capacity 唔放入 member dict 入面
# 係 gym-wide setting,屬於 self 頂層
要改:
check_in(timestamp, member_id) — 加 capacity check
check_in_with_duration(timestamp, member_id, duration_ms) — 加 capacity check
新加:
set_capacity(timestamp, max_members)
get_capacity(timestamp)
_process_expired(timestamp)
set_capacity / get_capacity 開頭都 call
先清走過期 session,再計 active count,先準確判斷有冇位
check_in 同 check_in_with_duration 要加 capacity check — 改動如下:
# ↓ check_in — L4 版(加咗 capacity check)
def check_in(self, timestamp, member_id): # 會員入場(有容量限制)
self._process_expired(timestamp) # 開頭先清過期
if member_id not in self.members: # 未註冊
return False # 唔畀入
info = self.members[member_id] # 攞 info
if info["checked_in"]: # 已經喺入面
return False # 唔好重複
if self.capacity != -1: # 有容量限制(-1 = 無限,跳過)
active = self.get_active_count(timestamp) # 數而家幾多人
if active >= self.capacity: # 滿咗
return False # 拒絕入場(gym is full)
info["checked_in"] = True # 入場
info["checkins"] += 1 # 入場次數 +1
return True # 成功就返 True;caller 可以當今次動作真係做咗
# ↓ check_in_with_duration — L4 版(加咗 capacity check)
def check_in_with_duration(self, timestamp, member_id, duration_ms): # 入場 + TTL(有容量限制)
self._process_expired(timestamp) # 開頭先清過期
if member_id not in self.members: # 未註冊
return False # 唔畀入
info = self.members[member_id] # 攞 info
if info["checked_in"]: # 已經喺入面
return False # 唔好重複
if self.capacity != -1: # 有容量限制
active = self.get_active_count(timestamp) # 數而家幾多人
if active >= self.capacity: # 滿咗
return False # 拒絕入場
info["checked_in"] = True # 入場
info["checkins"] += 1 # 入場次數 +1
info["expires_at"] = timestamp + duration_ms # 設定幾時自動離場
return True # 成功就返 True;caller 可以當今次動作真係做咗
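The capacity check goes after the "already checked_in → False" check and before the actual check-in,
i.e. it is the last gate once every other "should not enter" check has passed.
Order:
1. _process_expired ← clear expired sessions
2. member exists? ← basic check
3. already checked_in? ← no duplicates
4. capacity full? ← new in L4
5. actual check-in ← update fields
check_in has already called _process_expired on entry,
and get_active_count calls it again.
By the second call every expired session has been handled, so the loop finds nothing to change.
No side effects: it costs one extra pass over self.members but does not affect correctness.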
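A condensed sketch of why the expiry pass has to run before the capacity count. `MiniGym` and its `register` method are hypothetical, trimmed-down stand-ins for the class in these notes, and the timestamps are made up.

```python
class MiniGym:
    def __init__(self):
        self.members = {}
        self.capacity = -1  # -1 = unlimited

    def _process_expired(self, timestamp):
        for info in self.members.values():
            exp = info["expires_at"]
            if exp is not None and timestamp >= exp:
                info["checked_in"] = False
                info["expires_at"] = None

    def register(self, member_id):
        self.members[member_id] = {"checked_in": False, "expires_at": None}

    def check_in(self, timestamp, member_id, duration_ms=None):
        self._process_expired(timestamp)          # 1. clear expired sessions
        info = self.members.get(member_id)
        if info is None or info["checked_in"]:    # 2-3. exists, not already in
            return False
        if self.capacity != -1:                   # 4. capacity gate
            active = sum(1 for m in self.members.values() if m["checked_in"])
            if active >= self.capacity:
                return False
        info["checked_in"] = True                 # 5. actual check-in
        if duration_ms is not None:
            info["expires_at"] = timestamp + duration_ms
        return True

gym = MiniGym()
gym.capacity = 1
gym.register("m1")
gym.register("m2")
first = gym.check_in(1000, "m1", duration_ms=500)  # session ends at 1500
blocked = gym.check_in(1200, "m2")                 # m1 still inside, gym is full
admitted = gym.check_in(1500, "m2")                # m1 auto-checked-out at 1500
```

If the expiry pass ran after the count, the third call would still see m1 as active and reject m2.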
batch = 一拼做幾單嘢 lock = 鎖 per-member lock = 每個 member_id 一把鎖
async def batch_operations(self, timestamp, ops): # 一次過做一堆 register/check_in/check_out
results = [] # 暫存每個 op 嘅 True/False 結果
for op in ops: # 順住 input 順序逐個做
op_type = op["type"] # 攞 op 類型
mid = op["member_id"] # 攞 member_id
if op_type == "register": # register 類型
name = op["name"] # 攞 name
async with self.locks[mid]: # 鎖呢個 member_id
ok = self.register_member(timestamp, mid, name) # 走返 L1 嘅 register
results.append(ok) # 記返結果
elif op_type == "check_in": # check_in 類型
async with self.locks[mid]: # 鎖呢個 member_id
ok = self.check_in(timestamp, mid) # 走返 L1 嘅 check_in
results.append(ok) # 記返結果
elif op_type == "check_out": # check_out 類型
async with self.locks[mid]: # 鎖呢個 member_id
ok = self.check_out(timestamp, mid) # 走返 L1 嘅 check_out
results.append(ok) # 記返結果
else: # 其他 type 唔 support
results.append(False) # 一律 False
return results # 返一個同 input 一樣長嘅 list
def __init__(self):
self.members = {}
self.capacity = -1
self.locks = defaultdict(asyncio.Lock)
self.members = {
"m1": {"name": "Alice", "checked_in": True, "checkins": 1, "expires_at": None},
}
self.locks = {
"m1": <asyncio.Lock>, # defaultdict 一 access 就自動造
"m2": <asyncio.Lock>,
}
# 每個 member_id 一把獨立鎖
# 兩個 op 鎖唔同 member → 可以並行
# 兩個 op 鎖同一個 member → 後嗰個會等
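The per-key locking behaviour can be observed with a small sketch. The `op` coroutine, the labels and the 10 ms sleep are assumptions made for the demonstration; only `defaultdict(asyncio.Lock)` comes from the notes.

```python
import asyncio
from collections import defaultdict

async def main():
    locks = defaultdict(asyncio.Lock)  # one lock per key, created on first access
    log = []

    async def op(key, label):
        async with locks[key]:
            log.append(("start", label))
            await asyncio.sleep(0.01)  # pretend to do some work while holding the lock
            log.append(("end", label))

    # "a" and "b" share key m1, "c" uses a different key
    await asyncio.gather(op("m1", "a"), op("m1", "b"), op("m2", "c"))
    return log

log = asyncio.run(main())
```

In the resulting log, "b" only starts after "a" has ended, while "c" starts before "a" ends because it holds a different lock.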
新加:
batch_operations(timestamp, ops) ← async
init 多咗:
self.locks = defaultdict(asyncio.Lock)
無改 L1/L2/L3/L4 嘅 sync method(batch 入面 call 返佢哋)
Gym 嘅 L5 比 FS 簡單:每個 op 只涉及一個 member_id,唔需要 sorted double-lock(FS 嘅 copy 涉及 source + dest 兩個 path)。
_process_expired(timestamp)
間接 call(register / check_in / check_out 第一行都 call)
無額外 helper
sync = 同步轉場 semaphore = 信號燈(限制同時做嘅 transfer 數量) fail-fast = 一發現條件唔啱即刻 fail,唔等 semaphore
async def sync_members(self, timestamp, transfers, max_concurrent): # 並行做一堆 transfer,限 N 個 concurrent
self._process_expired(timestamp) # 開頭先清過期
sem = asyncio.Semaphore(max_concurrent) # 開一個 N 位嘅 semaphore(同時最多 N 個)
tasks = [] # 暫存所有 coroutine task
for transfer in transfers: # 逐個 transfer 包做一個 task
task = self._do_one_transfer(timestamp, transfer, sem) # 起 coroutine(未 await)
tasks.append(task) # 入 list
results = await asyncio.gather(*tasks) # 並發跑,等全部完,保留順序
final = [] # 轉做正常 list
for r in results: # 逐個 copy 過
final.append(r) # 入 list
return final # 返一個同 transfers 一樣長嘅 list[bool]
async def _do_one_transfer(self, timestamp, transfer, sem):  # handle a single transfer (async helper)
    mid = transfer["member_id"]  # member to transfer
    dest = transfer["destination"]  # destination (another gym)
    # fail-fast: check before acquiring the semaphore (does not hold up other tasks)
    if mid not in self.members:  # member does not exist
        return False  # fail immediately, never acquire the semaphore
    info = self.members[mid]
    if not info["checked_in"]:  # not inside, nothing to transfer
        return False  # fail immediately, never acquire the semaphore
    async with sem:  # passed fail-fast, now take a semaphore slot (rate limit)
        await asyncio.sleep(0.01)  # simulated transfer latency (10 ms)
        if not info["checked_in"]:  # re-check: a duplicate transfer may have finished during the sleep
            return False
        info["checked_in"] = False  # leave this gym
        info["expires_at"] = None  # clear the TTL
        return True  # transfer succeeded
def __init__(self):
self.members = {}
self.capacity = -1
self.locks = defaultdict(asyncio.Lock)
# 同 L5 一樣,semaphore 喺 method 入面開(per-call)
self.members = {
"m1": {"name": "Alice", "checked_in": True, "checkins": 5, "expires_at": None},
}
# 同前完全一樣
# semaphore 唔放入 self(每次 sync_members 都重新開一個 N 位)
新加:
_do_one_transfer(timestamp, transfer, sem) ← async helper
sync_members(timestamp, transfers, max_concurrent) ← async public
特別注意:
fail-fast check 寫喺 acquire sem 之前
唔好 acquire 咗 sem 先 check,否則「失敗嘅 task」都會白佔位
transfer 成功後要 check out member(離開呢間 gym)
_process_expired(timestamp)
sync_members 開頭 call 一次
_do_one_transfer(timestamp, transfer, sem)
本 level 自家嘅 async helper,包住 fail-fast + semaphore + sleep + checkout
同 FS、Bank、Hashring 一樣嘅 pattern。check 喺 sem 外面做,合格先入 sem + sleep,唔合格即走。
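A stripped-down sketch of the same shape: the validity check sits outside the semaphore, so an invalid request never occupies a slot. `sync_all`, `valid_ids` and the `running` / `peak` counters are hypothetical names added only to make the concurrency limit observable.

```python
import asyncio

async def sync_all(valid_ids, requests, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)
    running = 0
    peak = 0

    async def one(req):
        nonlocal running, peak
        if req not in valid_ids:      # fail-fast: no semaphore, no sleep
            return False
        async with sem:               # at most max_concurrent tasks get past here
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01) # simulated transfer latency
            running -= 1
            return True

    # gather returns results in the same order as the inputs
    results = await asyncio.gather(*(one(r) for r in requests))
    return list(results), peak

results, peak = asyncio.run(
    sync_all({"m1", "m2", "m3"}, ["m1", "ghost", "m2", "m3"], 2)
)
```

An all-sleep variant would move the validity check inside the `async with sem:` block, after the sleep.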
想像你寫一個 Tetris 棋盤 mock。有一塊 2D grid(board),可以放積木、移動、消行、跌落、undo。每塊積木有 shape(佔邊幾格)、位置(row, col)。
想像一塊 Tetris 棋盤:
┌──────────────────────────────────────────┐
│ 0 0 1 1 0 │ 第 0 行 │
│ 0 0 0 0 0 │ 第 1 行 │
│ 1 1 1 1 1 │ 第 2 行 ← 成行滿咗 │
│ 0 0 0 1 0 │ 第 3 行 │
└──────────────────────────────────────────┘
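Each piece has:
piece_id = piece ID (auto-incremented: "p1", "p2", ...)
shape = list of (r, c) offsets (the cells the piece occupies, relative to its anchor)
row, col = current position of the piece's top-left corner
Rules:
1. Check for collisions before placing a piece (cell already occupied, or out of bounds)
2. Move direction is "left" / "right" / "down", one cell per move
3. A completely filled row is cleared (the rows above fall down to fill the gap)
4. drop = fall straight to the lowest valid position
5. undo = roll back the last action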
# 例:上面塊棋盤查一啲嘢
get_board_state(t) → 返成個 grid(2D list)
get_piece_count(t) → 2(而家棋盤上有 2 塊積木)
list_pieces(t, "row") → "p2(3), p1(0)"(由低排到高)
get_score(t) → 1(已經消咗 1 行)
# 後面 level 加多啲嘢:
# L1 place_piece / remove_piece / get_board_state
# L2 move_piece / get_piece_count / list_pieces
# L3 _process_line_clears / get_score(lazy 消行)
# L4 drop_piece / get_history / undo_last
# L5 async batch_operations(per-piece lock)
# L6 sync_boards(rate-limited,semaphore)
import asyncio
from collections import defaultdict
class TetrisSystem:
def __init__(self, rows, cols):
self.rows = rows # 棋盤有幾多行
self.cols = cols # 棋盤有幾多列
self.board = [] # 棋盤本體(2D list,0=空格 1=有積木)
for r in range(rows): # 逐行造
row = [] # 一行嘅格仔
for c in range(cols): # 逐格填 0
row.append(0) # 0 = 空格
self.board.append(row) # 入落棋盤
self.pieces = {} # L1 所有積木(piece_id → info dict)
self.piece_counter = 0 # L1 自動遞增(下一塊積木嘅編號)
self.score = 0 # L3 加:消咗幾多行
self.history = [] # L4 加:所有動作記錄
self.locks = defaultdict(asyncio.Lock) # L5 加:per-piece 嘅 async lock
self.board = [ # 棋盤(2D list,每格 0=空 1=有嘢)
[0, 0, 1, 1, 0], # 第 0 行
[0, 0, 0, 0, 0], # 第 1 行
[1, 1, 1, 1, 1], # 第 2 行 ← 成行滿咗,會被消
[0, 0, 0, 1, 0], # 第 3 行
]
self.pieces = { # 積木記錄(piece_id → 資料)
"p1": {
"shape": [(0,0),(0,1)], # 佔嘅格仔(row, col offset)
"row": 0, # 而家喺第幾行(左上角)
"col": 2, # 而家喺第幾列(左上角)
},
"p2": {
"shape": [(0,0)], # 單格積木
"row": 3, # 第 3 行
"col": 3, # 第 3 列
},
}
L1:board, pieces, piece_counter # 最基本
L2:(冇加新 field,只係讀 pieces)
L3:score # 消咗幾多行
L4:history # 動作記錄 list
L5:self.locks # defaultdict(asyncio.Lock)
L6:(冇加新 field,semaphore 喺 method 入面開)
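# Helper: _is_valid: check whether a piece can sit at (row, col) without leaving the board or colliding
def _is_valid(self, shape, row, col, exclude_piece_id=None):
    own_cells = set()  # cells of the piece being moved; overlapping these is not a collision
    if exclude_piece_id is not None and exclude_piece_id in self.pieces:
        own = self.pieces[exclude_piece_id]
        for dr, dc in own["shape"]:
            own_cells.add((own["row"] + dr, own["col"] + dc))
    for dr, dc in shape:  # check every offset
        r = row + dr  # absolute row
        c = col + dc  # absolute column
        if r < 0 or r >= self.rows:  # past the top or bottom edge
            return False
        if c < 0 or c >= self.cols:  # past the left or right edge
            return False
        if self.board[r][c] != 0 and (r, c) not in own_cells:  # occupied by another piece
            return False
    return True  # every offset is in bounds and free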
# Helper: _stamp — 將積木寫入棋盤(填 1)
def _stamp(self, shape, row, col): # 喺棋盤上畫呢塊積木
for dr, dc in shape: # 逐個 offset
r = row + dr # 實際行
c = col + dc # 實際列
self.board[r][c] = 1 # 填 1(有積木)
# Helper: _erase — 將積木從棋盤擦走(填 0)
def _erase(self, shape, row, col): # 喺棋盤上擦走呢塊積木
for dr, dc in shape: # 逐個 offset
r = row + dr # 實際行
c = col + dc # 實際列
self.board[r][c] = 0 # 填 0(空格)
_is_valid(shape, row, col, exclude_piece_id=None)
逐個 offset check:
1. 有冇出界(row/col 超出 board 範圍)
2. 有冇 collision(個格已經有嘢)
exclude_piece_id 用喺移動時排除自己嘅格仔
_stamp(shape, row, col)
將積木寫入 board(fill 1)
place_piece / move_piece 成功後 call
_erase(shape, row, col)
將積木從 board 擦走(fill 0)
remove_piece / move_piece 前 call
Bank/Gym/FS 嘅 helper = _process_expired(lazy TTL)
Tetris 唔用 TTL,用 collision detection
所以 helper 換咗做 _is_valid / _stamp / _erase
冇 "每個 public method 第一行 call" 嘅 pattern
但 L3 有 _process_line_clears(lazy 消行)
place_piece = 放積木落棋盤 remove_piece = 移走積木 get_board_state = 攞成個棋盤
def place_piece(self, timestamp, shape, row, col):  # place a piece on the board
    if not self._is_valid(shape, row, col):  # out of bounds or collides with an existing piece
        return None  # cannot place → return None
    self.piece_counter += 1  # next piece number
    pid = "p" + str(self.piece_counter)  # build piece_id ("p1", "p2", ...)
    self.pieces[pid] = {  # record this piece
        "shape": shape,  # occupied cells (offset list)
        "row": row,  # current row
        "col": col,  # current column
    }
    self._stamp(shape, row, col)  # draw it on the board (fill with 1)
    self.history.append("placed:" + pid + "@(" + str(row) + "," + str(col) + ")")  # record in history
    return pid  # return the piece_id
def remove_piece(self, timestamp, piece_id): # 移走一塊積木
if piece_id not in self.pieces: # 唔存在
return False # 查無此積木
info = self.pieces[piece_id] # 攞積木資料
self._erase(info["shape"], info["row"], info["col"]) # 從棋盤擦走(填 0)
del self.pieces[piece_id] # 從 pieces dict 刪走
self.history.append("removed:" + piece_id) # 記入歷史
return True # 移走成功
def get_board_state(self, timestamp): # 攞成個棋盤狀態
result = [] # 造一個新 list(避免 caller 改到原本嘅 board)
for r in range(self.rows): # 逐行
row_copy = [] # 呢行嘅 copy
for c in range(self.cols): # 逐格
row_copy.append(self.board[r][c]) # copy 過去
result.append(row_copy) # 入落 result
return result # 返 2D list(deep copy)
def __init__(self, rows, cols):
self.rows = rows
self.cols = cols
self.board = [[0]*cols for _ in range(rows)]
self.pieces = {}
self.piece_counter = 0
self.board = [
[0, 0, 1, 1, 0], # p1 佔 (0,2) 同 (0,3)
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
]
self.pieces = {
"p1": {
"shape": [(0,0),(0,1)], # 橫住兩格
"row": 0, # 第 0 行
"col": 2, # 第 2 列
},
}
# board[row][col] 入面嘅 1 同 pieces 入面嘅 shape + row + col 要一致
# _stamp 負責寫入,_erase 負責擦走
_is_valid(shape, row, col)
place_piece 入面 call,check 有冇出界/collision
_stamp(shape, row, col)
place_piece 成功後 call,將積木畫上棋盤
_erase(shape, row, col)
remove_piece 入面 call,將積木從棋盤擦走
move_piece = 移動積木(左/右/落) get_piece_count = 而家幾多塊 list_pieces = 列晒所有積木
def move_piece(self, timestamp, piece_id, direction): # 移動積木("left"/"right"/"down")
if piece_id not in self.pieces: # 唔存在
return False # 查無此積木
info = self.pieces[piece_id] # 攞積木資料
old_row = info["row"] # 記住舊位置
old_col = info["col"] # 記住舊列
new_row = old_row # 預設唔變
new_col = old_col # 預設唔變
if direction == "left": # 向左
new_col = old_col - 1 # 列數 -1
elif direction == "right": # 向右
new_col = old_col + 1 # 列數 +1
elif direction == "down": # 向下
new_row = old_row + 1 # 行數 +1(越大越低)
else: # 未知方向
return False # 唔支援
self._erase(info["shape"], old_row, old_col) # 先擦走舊位置(避免自己撞自己)
if not self._is_valid(info["shape"], new_row, new_col): # check 新位置
self._stamp(info["shape"], old_row, old_col) # 唔得 → 畫返舊位
return False # 移動失敗
self._stamp(info["shape"], new_row, new_col) # 得 → 畫新位置
info["row"] = new_row # 更新 row
info["col"] = new_col # 更新 col
self.history.append("moved:" + piece_id + ":" + direction) # 記入歷史
return True # 移動成功
def get_piece_count(self, timestamp): # 而家棋盤上幾多塊積木
count = 0 # 由 0 開始數
for pid in self.pieces: # 逐個 piece 行
count += 1 # 數一個
return count # 返總數
def list_pieces(self, timestamp, sort_by): # 列晒所有積木,按 id 或 row 排
items = [] # 暫存 (piece_id, row) tuple
for pid, info in self.pieces.items(): # 逐塊積木攞出嚟
items.append((pid, info["row"])) # 砌做 tuple
if sort_by == "row": # 按 row 排(低嘅排先 = 大 row 排先)
items.sort(key=lambda x: (-x[1], x[0])) # row desc,tie 用 id asc
else: # 預設按 id 排
items.sort(key=lambda x: x[0]) # id asc(字母順序)
parts = [] # 砌 output 字串
for pid, row in items: # 逐個轉做 "pid(row)"
parts.append(pid + "(" + str(row) + ")") # 砌單個 entry
return ", ".join(parts) # 用 ", " 連埋一齊
def __init__(self, rows, cols):
self.rows = rows
self.cols = cols
self.board = [[0]*cols for _ in range(rows)]
self.pieces = {}
self.piece_counter = 0
# 同 L1 一樣,冇加新 field
1. 攞舊位置 (old_row, old_col)
2. 計新位置 (new_row, new_col)
3. _erase 舊位置(先擦走自己,避免自己撞自己)
4. _is_valid 新位置(check 出界/collision)
├ 唔得 → _stamp 返舊位置 → return False
└ 得 → _stamp 新位置 → 更新 info → return True
關鍵:要先 erase 再 check
如果唔 erase 就 check,積木自己嘅格仔會當做 collision
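A small sketch of the erase-before-check point, on a one-row board. `is_valid` and `paint` are hypothetical free-function versions of `_is_valid` and `_stamp` / `_erase`, written only for this illustration.

```python
def is_valid(board, shape, row, col):
    for dr, dc in shape:
        r, c = row + dr, col + dc
        if not (0 <= r < len(board) and 0 <= c < len(board[0])):
            return False  # out of bounds
        if board[r][c] != 0:
            return False  # occupied
    return True

def paint(board, shape, row, col, value):
    # value=1 stamps the piece, value=0 erases it
    for dr, dc in shape:
        board[row + dr][col + dc] = value

board = [[0, 0, 0, 0]]
shape = [(0, 0), (0, 1)]          # two cells side by side
paint(board, shape, 0, 0, 1)      # piece occupies columns 0 and 1

# Moving right by one: the target cells are columns 1 and 2.
without_erase = is_valid(board, shape, 0, 1)  # column 1 is the piece's own old cell

paint(board, shape, 0, 0, 0)      # erase the old position first
with_erase = is_valid(board, shape, 0, 1)
paint(board, shape, 0, 1, 1)      # stamp at the new position
```

Without the erase, the piece collides with its own old cell and the move is wrongly rejected.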
# items.sort(key=lambda x: (-x[1], x[0]))
# -x[1] = -row(大 row = 低位置,排先)
# x[0] = pid(tie-break 用 id 字母升序)
items = [
("p2", 3), # -3 ← 最細,排先(最低位置)
("p1", 0), # -0 ← 排後(最高位置)
]
→ "p2(3), p1(0)"
_is_valid(shape, row, col)
move_piece 入面 call,check 新位置有冇 collision
_stamp(shape, row, col)
move_piece 成功後 call,畫新位置
_erase(shape, row, col)
move_piece 前 call,擦走舊位置
消行 = 成行全部係 1 → 清走 → 上面跌落嚟 score = 總共消咗幾多行 lazy = place/move 之後先 check
def _process_line_clears(self, timestamp):  # lazy line clear (call after place/move/drop)
    cleared = 0  # rows cleared in this call
    r = self.rows - 1  # scan from the bottom row upwards
    while r >= 0:
        full = True  # assume the row is full
        for c in range(self.cols):
            if self.board[r][c] == 0:  # found an empty cell
                full = False
                break
        if full:
            del self.board[r]  # remove the full row
            self.board.insert(0, [0] * self.cols)  # add an empty row on top; rows above shift down by one
            cleared += 1
            self.history.append("cleared:row_" + str(r))  # record in history
            self._rebuild_pieces(r)  # keep piece records in sync with the shifted board
            # r is not decremented: index r now holds the row that used to be above it
        else:
            r -= 1  # move up one row
    self.score += cleared  # add to the total score
    return cleared  # rows cleared in this call

def _rebuild_pieces(self, cleared_row):  # update piece cells after one row was cleared
    for pid, info in self.pieces.items():
        cells = []  # absolute cells this piece still occupies
        for dr, dc in info["shape"]:
            r = info["row"] + dr
            c = info["col"] + dc
            if r == cleared_row:
                continue  # this cell sat on the cleared row and is gone
            if r < cleared_row:
                r += 1  # cells above the cleared row fall by one
            cells.append((r, c))
        if not cells:  # the whole piece was cleared
            info["shape"] = []  # entry is kept, as in the original notes (cleanup left for later)
            continue
        min_r = min(r for r, c in cells)
        min_c = min(c for r, c in cells)
        info["row"] = min_r  # new top-left anchor
        info["col"] = min_c
        info["shape"] = [(r - min_r, c - min_c) for r, c in cells]  # offsets relative to the new anchor

def get_score(self, timestamp):  # total score (number of rows cleared)
    return self.score  # score is updated inside _process_line_clears
def get_score(self, timestamp): # 攞總分(消咗幾多行)
return self.score # 直接返(score 喺 _process_line_clears 入面加)
def __init__(self, rows, cols):
self.rows = rows
self.cols = cols
self.board = [[0]*cols for _ in range(rows)]
self.pieces = {}
self.piece_counter = 0
self.score = 0
消行前:
row 0: [0, 0, 1, 1, 0]
row 1: [1, 1, 1, 1, 1] ← 滿行
row 2: [0, 0, 0, 1, 0]
del board[1] → 刪走滿行
insert(0, [0,0,0,0,0]) → 頂部補一行空行
消行後:
row 0: [0, 0, 0, 0, 0] ← 新嘅空行
row 1: [0, 0, 1, 1, 0] ← 本來嘅 row 0 跌咗一格
row 2: [0, 0, 0, 1, 0] ← 本來嘅 row 2 唔變
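After del board[r] plus insert(0, empty row), every row above shifts down by one, so the row that was at r-1 now sits at index r.
That is why r is not decremented after a clear: the next loop iteration re-checks the same index, which now holds a row that has not been checked yet.
Decrementing r after a clear would skip that row and miss consecutive full rows.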
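A standalone sketch of the bottom-up scan on a plain 2D list, with two consecutive full rows to show why the index stays put after a clear. `clear_full_rows` is a hypothetical free-function version of `_process_line_clears` without the history and piece bookkeeping.

```python
def clear_full_rows(board):
    cols = len(board[0])
    cleared = 0
    r = len(board) - 1                # start at the bottom row
    while r >= 0:
        if all(cell == 1 for cell in board[r]):
            del board[r]              # remove the full row
            board.insert(0, [0] * cols)  # rows above shift down by one
            cleared += 1
            # do not decrement r: index r now holds the row that was above
        else:
            r -= 1
    return cleared

board = [
    [0, 0, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
]
cleared = clear_full_rows(board)
```

The loop terminates because every inserted row is all zeros and therefore never counts as full.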
要改:
place_piece 尾加 self._process_line_clears(timestamp)
move_piece 成功後加 self._process_line_clears(timestamp)
新加:
_process_line_clears(timestamp)
_rebuild_pieces()
get_score(timestamp)
drop = 積木直接跌到最低有效位 history = 動作記錄 undo = 回退最後一個動作
def drop_piece(self, timestamp, piece_id): # 積木直接跌到最底
if piece_id not in self.pieces: # 唔存在
return False # 查無此積木
info = self.pieces[piece_id] # 攞積木資料
shape = info["shape"] # 攞 shape
old_row = info["row"] # 記住起始行
old_col = info["col"] # 記住列
self._erase(shape, old_row, old_col) # 先擦走(避免自撞)
final_row = old_row # 記住最後有效位置
test_row = old_row + 1 # 由下一行開始試
while test_row < self.rows: # 唔超出底部
if self._is_valid(shape, test_row, old_col): # 呢行得唔得
final_row = test_row # 記住(繼續試更低)
test_row += 1 # 再落一行
else: # 落唔到
break # 停喺上一行
self._stamp(shape, final_row, old_col) # 畫喺最低有效位
info["row"] = final_row # 更新 row
self.history.append("dropped:" + piece_id + "@(" + str(final_row) + "," + str(old_col) + ")") # 記入歷史
self._process_line_clears(timestamp) # drop 之後 check 消行
return True # drop 成功
def get_history(self, timestamp): # 攞所有動作記錄
result = [] # 造 copy
for entry in self.history: # 逐條記錄
result.append(entry) # copy 入 result
return result # 返 list of string
def undo_last(self, timestamp):  # roll back the last action
    if not self.history:  # nothing to undo
        return False
    last = self.history.pop()  # take and remove the last entry
    if last.startswith("placed:"):  # last action was place → undo = remove
        pid = last.split(":")[1].split("@")[0]  # extract piece_id
        if pid in self.pieces:  # still on the board
            info = self.pieces[pid]
            self._erase(info["shape"], info["row"], info["col"])  # wipe from the board
            del self.pieces[pid]  # drop the piece record
    elif last.startswith("removed:"):  # last action was remove → undo = put back (simplified)
        pass  # simplified: undoing a remove needs an extra snapshot, omitted here
    elif last.startswith("moved:"):  # last action was move → undo = shift back
        parts = last.split(":")  # ["moved", "p1", "left"]
        pid = parts[1]
        direction = parts[2]
        # move_piece has no "up", so the piece is shifted back directly instead of calling it;
        # this also avoids adding a new history entry for the undo itself
        reverse = {"left": (0, 1), "right": (0, -1), "down": (-1, 0)}  # (row delta, col delta)
        if pid in self.pieces and direction in reverse:
            dr, dc = reverse[direction]
            info = self.pieces[pid]
            self._erase(info["shape"], info["row"], info["col"])  # erase first to avoid self-collision
            if self._is_valid(info["shape"], info["row"] + dr, info["col"] + dc):
                info["row"] += dr
                info["col"] += dc
            self._stamp(info["shape"], info["row"], info["col"])  # redraw at the final position
    return True  # undo done
def __init__(self, rows, cols):
self.rows = rows
self.cols = cols
self.board = [[0]*cols for _ in range(rows)]
self.pieces = {}
self.piece_counter = 0
self.score = 0
self.history = []
1. _erase 舊位置(同 move 一樣,先擦走)
2. 由 old_row+1 開始逐行往下試
└ _is_valid → 得就繼續往下,唔得就 break
3. _stamp 最後有效位置
4. 更新 info["row"]
5. _process_line_clears(drop 後好大機會消行)
同 move_piece("down") 嘅分別:
move = 一次一格
drop = 一直落到底
self.history = [
"placed:p1@(0,2)", # place_piece 產生
"moved:p1:down", # move_piece 產生
"cleared:row_4", # _process_line_clears 產生
"dropped:p2@(3,1)", # drop_piece 產生
"removed:p1", # remove_piece 產生
]
placed → remove 返(erase + del)
moved → move 返反方向
removed → 需要 snapshot(simplified,此處 pass)
cleared → 極難 undo(需要記住被消嘅行)
dropped → 需要記住原始 row(simplified)
考試通常只要求 undo place 同 move
batch = 一拼做幾單嘢 lock = 鎖 per-piece lock = 每個 piece_id 一把鎖
async def batch_operations(self, timestamp, ops): # 一次過做一堆 place/move/remove
results = [] # 暫存每個 op 嘅結果
for op in ops: # 順住 input 順序逐個做
op_type = op["type"] # 攞 op 類型
if op_type == "place": # place 類型
shape = op["shape"] # 攞 shape
row = op["row"] # 攞 row
col = op["col"] # 攞 col
pid = "p" + str(self.piece_counter + 1) # 預計嘅 piece_id
async with self.locks[pid]: # 鎖呢個 piece_id
result = self.place_piece(timestamp, shape, row, col) # call L1
results.append(result) # 記返結果(pid 或 None)
elif op_type == "move": # move 類型
pid = op["piece_id"] # 攞 piece_id
direction = op["direction"] # 攞方向
async with self.locks[pid]: # 鎖呢個 piece_id
ok = self.move_piece(timestamp, pid, direction) # call L2
results.append(ok) # 記返結果
elif op_type == "remove": # remove 類型
pid = op["piece_id"] # 攞 piece_id
async with self.locks[pid]: # 鎖呢個 piece_id
ok = self.remove_piece(timestamp, pid) # call L1
results.append(ok) # 記返結果
else: # 其他 type 唔 support
results.append(False) # 一律 False
return results # 返一個同 input 一樣長嘅 list
def __init__(self, rows, cols):
self.rows = rows
self.cols = cols
self.board = [[0]*cols for _ in range(rows)]
self.pieces = {}
self.piece_counter = 0
self.score = 0
self.history = []
self.locks = defaultdict(asyncio.Lock)
self.locks = {
"p1": <asyncio.Lock>, # defaultdict 一 access 就自動造
"p2": <asyncio.Lock>,
}
# 每個 piece_id 一把獨立鎖
# 兩個 op 鎖唔同 piece → 可以並行
# 兩個 op 鎖同一個 piece → 後嗰個會等
新加:
batch_operations(timestamp, ops) ← async
init 多咗:
self.locks = defaultdict(asyncio.Lock)
無改 L1/L2/L3/L4 嘅 sync method(batch 入面 call 返佢哋)
Gym:op key = member_id
Tetris:op key = piece_id
Pattern 一樣:for loop + async with lock[key] + call sync method
唯一分別:place 嘅 lock key 係預計嘅 pid(因為 place 前未有 id)
sync_boards = 將積木 transfer 去另一塊板 semaphore = 限制同時做嘅 transfer 數 fail-fast = piece 唔存在即走
async def sync_boards(self, timestamp, transfers, max_concurrent): # 並行做一堆 transfer,限 N 個 concurrent
sem = asyncio.Semaphore(max_concurrent) # 開一個 N 位嘅 semaphore
tasks = [] # 暫存所有 coroutine task
for transfer in transfers: # 逐個 transfer 包做一個 task
task = self._do_one_sync(timestamp, transfer, sem) # 起 coroutine(未 await)
tasks.append(task) # 入 list
results = await asyncio.gather(*tasks) # 並發跑,等全部完,保留順序
final = [] # 轉做正常 list
for r in results: # 逐個 copy 過
final.append(r) # 入 list
return final # 返一個同 transfers 一樣長嘅 list[bool]
async def _do_one_sync(self, timestamp, transfer, sem):  # handle a single transfer (async helper)
    pid = transfer["piece_id"]  # piece to transfer
    dest = transfer["destination"]  # destination (another board)
    # fail-fast: check before acquiring the semaphore (does not hold up other tasks)
    if pid not in self.pieces:  # piece does not exist
        return False  # fail immediately, never acquire the semaphore
    async with sem:  # passed fail-fast, now take a semaphore slot (rate limit)
        await asyncio.sleep(0.01)  # simulated transfer latency (10 ms)
        info = self.pieces.pop(pid, None)  # re-check: a duplicate transfer may have removed it during the sleep
        if info is None:
            return False  # avoids a KeyError when the same piece_id appears twice
        self._erase(info["shape"], info["row"], info["col"])  # wipe from this board
        self.history.append("synced:" + pid + "->" + dest)  # record in history
        return True  # transfer succeeded
def __init__(self, rows, cols):
self.rows = rows
self.cols = cols
self.board = [[0]*cols for _ in range(rows)]
self.pieces = {}
self.piece_counter = 0
self.score = 0
self.history = []
self.locks = defaultdict(asyncio.Lock)
# semaphore 喺 method 入面開(per-call)
self.pieces = {
"p1": {"shape": [(0,0),(0,1)], "row": 0, "col": 2},
}
# 同前完全一樣
# semaphore 唔放入 self(每次 sync_boards 都重新開一個 N 位)
新加:
_do_one_sync(timestamp, transfer, sem) ← async helper
sync_boards(timestamp, transfers, max_concurrent) ← async public
特別注意:
fail-fast check 寫喺 acquire sem 之前
唔好 acquire 咗 sem 先 check,否則「失敗嘅 task」都會白佔位
transfer 成功後要 erase + del piece(離開本板)
同 Gym/Bank/FS 完全一樣嘅 pattern:
1. fail-fast check(sem 外面)
2. async with sem:(限速)
3. await asyncio.sleep(0.01)(模擬延遲)
4. 做嘢 + return True
唯一分別:
Gym: checkout member(改 field)
Tetris: erase + del piece(從棋盤移除)
一個拍賣系統:開拍賣 → 出價 → 截標 → 揀贏家 → 交收。類似 eBay 嘅 in-memory 版本。
# Auction = 拍賣系統,dict-of-dicts
# 核心 data structure:
self.auctions = {} # auction_id → {item, starting_price, bids, status, expires_at, reserve_price}
# 每個 method 第一個 param 都係 timestamp
# L3 開始每個 method 開頭都 call self._process_expired_auctions(timestamp)
# 比喻:
# auction = 拍賣(一件物品攤出嚟賣)
# bid = 出價(買家舉牌叫價)
# bidder = 買家(舉牌嗰個人)
# reserve = 底價(賣家設嘅最低成交價,未到就流標)
# settle = 交收(拍賣完畢,正式收錢交貨)
# deadline = 截止時間(過咗就收檔)
# Level 進程:
# L1: CRUD(開拍賣 / 出價 / 查詢)
# L2: Sort/Filter/Format(排拍賣 + 排出價)
# L3: TTL/Deadline/Lazy(截標 + 揀贏家)
# L4: Reserve Price + History(底價 + 出價歷史 + 取消)
# L5: Concurrent Batch(async gather + lock per auction)
# L6: Settlement(交收 + semaphore + fail-fast)
Auction 同其他 system 嘅分別:
Bank 用 balance 增減(轉賬)
Auction 用 bids list 記錄出價(只加唔減)
Auction 嘅特色:
1. 每個 method 都有 timestamp param
2. L3 開始有 lazy processing(_process_expired_auctions)
3. bid 要 > current highest(唔係 >=)
4. status 有 "OPEN" / "CLOSED" / "CANCELLED"
5. settle 要 CLOSED + has winner 先過
auctions 實際樣子:
{
"auc1": {
"item": "painting",
"starting_price": 100,
"bids": [
{"bidder": "b1", "amount": 150, "time": 1000},
{"bidder": "b2", "amount": 200, "time": 2000}
],
"status": "OPEN",
"expires_at": None,
"reserve_price": None
}
}
每個 auction 有六個 field:
item = 拍賣物品名
starting_price = 起拍價(第一口叫價要 > 呢個)
bids = 所有出價記錄 list
status = "OPEN" / "CLOSED" / "CANCELLED"
expires_at = 截止時間(None = 冇 deadline)
reserve_price = 底價(None = 冇底價限制)
import asyncio # L5 async 用
from collections import defaultdict # L5 auto-create lock 用
class AuctionSystem:
def __init__(self):
self.auctions = {} # L1 — auction_id → {item, starting_price, bids, status, ...}
self.auction_locks = defaultdict(asyncio.Lock) # L5 加 — 每個 auction 一把鎖
self.auctions 實際樣子:
{
"auc1": {
"item": "painting",
"starting_price": 100,
"bids": [
{"bidder": "b1", "amount": 150, "time": 1000}
],
"status": "OPEN",
"expires_at": None,
"reserve_price": None
}
}
# 點攞 data:
auc = self.auctions["auc1"] # pointer
auc["bids"] # list of bid dicts
auc["bids"][-1]["amount"] # 最後一個 bid 嘅金額
auc["status"] # "OPEN" / "CLOSED" / "CANCELLED"
# Helper: _process_expired_auctions — 每個 method 開頭都 call(L3 開始)
# Lazy processing:唔會自動截標,要等有人 call method 帶 timestamp 入嚟先 check
def _process_expired_auctions(self, timestamp): # 行晒所有 auction,到期嘅標記 CLOSED
for aid, auc in self.auctions.items(): # 逐個 auction 睇
if auc["status"] == "OPEN" and auc["expires_at"] is not None and timestamp >= auc["expires_at"]: # 仲開緊 + 有 deadline + 到期
auc["status"] = "CLOSED" # 截標!改做 CLOSED(唔收新 bid)
_process_expired_auctions 嘅職責:
每次有人 call 任何 method 帶 timestamp 入嚟
→ 行晒 self.auctions dict
→ 搵仲 OPEN + 有 expires_at + 到期 (timestamp >= expires_at) 嘅
→ 將 status 改做 "CLOSED"
# 點解叫 "lazy"?
# 因為唔係到期就自動 close
# 要等下一次有人 call method 先 check
# 例如 expires_at = 5000
# timestamp=4999 call place_bid → 4999 < 5000 → 仲 OPEN → 可以出價
# timestamp=5000 call get_auction → 5000 >= 5000 → close 咗!
# 點解只 check status == "OPEN"?
# CLOSED / CANCELLED 嘅唔使再處理
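The boundary case above (4999 versus 5000) as a runnable sketch. `process_expired` is a hypothetical free-function version of `_process_expired_auctions`, and the three auctions are made-up data holding only the fields the check reads.

```python
def process_expired(auctions, timestamp):
    for auc in auctions.values():
        if (auc["status"] == "OPEN"
                and auc["expires_at"] is not None
                and timestamp >= auc["expires_at"]):
            auc["status"] = "CLOSED"

auctions = {
    "auc1": {"status": "OPEN", "expires_at": 5000},
    "auc2": {"status": "OPEN", "expires_at": None},        # no deadline
    "auc3": {"status": "CANCELLED", "expires_at": 1000},   # already final
}

process_expired(auctions, 4999)
status_at_4999 = auctions["auc1"]["status"]   # 4999 < 5000, still open

process_expired(auctions, 5000)
status_at_5000 = auctions["auc1"]["status"]   # 5000 >= 5000, closed
```

The cancelled auction keeps its status even though its deadline has passed, because only OPEN auctions are considered.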
開拍賣(create_auction),出價(place_bid),查詢(get_auction, get_highest_bid)。每個 method check 存唔存在 + return 適當嘅值。
def create_auction(self, timestamp, auction_id, item_name, starting_price): # 開一場新拍賣
self._process_expired_auctions(timestamp) # L3 加:先處理到期嘅 auction
if auction_id in self.auctions: # 呢個 auction ID 用咗未?
return False # 已經有 → 拒絕重複開
self.auctions[auction_id] = { # 開個新拍賣,記低資料
"item": item_name, # 拍賣物品名
"starting_price": starting_price, # 起拍價
"bids": [], # 未有人出價
"status": "OPEN", # 開放接受出價
"expires_at": None, # L3 加:冇 deadline(永唔截標)
"reserve_price": None # L4 加:冇底價限制
}
return True # 開成功
def place_bid(self, timestamp, auction_id, bidder_id, amount): # 買家出價
self._process_expired_auctions(timestamp) # L3 加:先處理到期嘅 auction
if auction_id not in self.auctions: # auction 唔存在?
return False # 搵唔到 → 唔做
auc = self.auctions[auction_id] # 攞出嚟(pointer)
if auc["status"] != "OPEN": # 拍賣仲開緊?
return False # 已經 CLOSED 或 CANCELLED → 唔收新 bid
# 計算而家最高價:有 bid 就攞最後一個嘅 amount,冇就用 starting_price
current_highest = auc["bids"][-1]["amount"] if auc["bids"] else auc["starting_price"]
if amount <= current_highest: # 出價要嚴格大於而家最高(唔係 >=)
return False # 唔夠高 → 拒絕
auc["bids"].append({"bidder": bidder_id, "amount": amount, "time": timestamp}) # 記錄出價
return True # 出價成功
create_auction(1, "auc1", "painting", 100) 之後:
self.auctions = {
"auc1": {
"item": "painting",
"starting_price": 100,
"bids": [],
"status": "OPEN",
"expires_at": None,
"reserve_price": None
}
}
place_bid(2, "auc1", "b1", 150) 之後:
self.auctions["auc1"]["bids"] = [
{"bidder": "b1", "amount": 150, "time": 2}
]
# return True(150 > 100 起拍價)
place_bid(3, "auc1", "b2", 120) → False
# 120 <= 150(而家最高)→ 拒絕
place_bid(4, "auc1", "b2", 200) 之後:
self.auctions["auc1"]["bids"] = [
{"bidder": "b1", "amount": 150, "time": 2},
{"bidder": "b2", "amount": 200, "time": 4}
]
# How current_highest is computed:
# bids exist → bids[-1]["amount"] (200 after the two accepted bids above)
# no bids → starting_price = 100
# a new bid is accepted only if it is > current_highest
def get_auction(self, timestamp, auction_id): # 查拍賣資料
self._process_expired_auctions(timestamp) # L3 加:先處理到期嘅 auction
if auction_id not in self.auctions: # 唔存在?
return None # 搵唔到
return self.auctions[auction_id] # 回傳成個 dict(pointer)
def get_highest_bid(self, timestamp, auction_id): # 攞最高出價
self._process_expired_auctions(timestamp) # L3 加:先處理到期嘅 auction
if auction_id not in self.auctions: # 唔存在?
return None # 搵唔到
auc = self.auctions[auction_id] # 攞 auction pointer
if not auc["bids"]: # 冇人出過價?
return None # 冇 highest bid
top = auc["bids"][-1] # 最後一個 bid 就係最高(因為 place_bid 保證遞增)
return (top["bidder"], top["amount"]) # 回傳 tuple (bidder_id, amount)
get_auction(5, "auc1") →
{
"item": "painting",
"starting_price": 100,
"bids": [{"bidder":"b1","amount":150,"time":2}, ...],
"status": "OPEN",
"expires_at": None,
"reserve_price": None
}
get_highest_bid(5, "auc1") →
("b2", 200)
get_highest_bid(5, "auc_empty") →
None(冇人出價)
點解 bids[-1] 就係最高?
因為 place_bid 保證每次新 bid > 之前最高
所以 bids list 天然係 ascending order
最尾嗰個 = 最高
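The argument above (every accepted bid is strictly higher than the previous one, so the list stays ascending) can be checked with a small standalone sketch. The free function `place_bid` below is a simplified stand-in for the method in these notes, not the class itself.

```python
def place_bid(auction, bidder_id, amount, timestamp):
    """Accept a bid only if it is strictly higher than the current highest."""
    bids = auction["bids"]
    current_highest = bids[-1]["amount"] if bids else auction["starting_price"]
    if amount <= current_highest:  # equal bids are rejected too
        return False
    bids.append({"bidder": bidder_id, "amount": amount, "time": timestamp})
    return True


auction = {"starting_price": 100, "bids": []}
outcomes = [
    place_bid(auction, "b1", 150, 2),  # 150 > 100, accepted
    place_bid(auction, "b2", 120, 3),  # 120 <= 150, rejected
    place_bid(auction, "b2", 200, 4),  # 200 > 150, accepted
    place_bid(auction, "b3", 200, 5),  # 200 <= 200, rejected (strict)
]
amounts = [b["amount"] for b in auction["bids"]]
```

Because rejected bids never reach the list, `amounts` is always sorted and its last element is the maximum, which is why `bids[-1]` needs no `max()` scan.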
Sorting auctions (list_auctions): by ID, or by highest bid descending. Sorting bids (list_bids): by amount descending. list_auctions returns strings in the format "auction_id(highest_bid)".
def list_auctions(self, timestamp, sort_by): # 列出所有拍賣,format "id(highest_bid)"
self._process_expired_auctions(timestamp) # 先處理到期 auction
def get_highest(auc): # helper:攞一個 auction 嘅最高出價
if auc["bids"]:
return auc["bids"][-1]["amount"] # 有 bid → 最後一個
return 0 # 冇 bid → 0
if sort_by == "id": # 按 auction_id 字母排
sorted_items = sorted(self.auctions.items(), key=lambda x: x[0])
elif sort_by == "price": # 按 highest bid desc 排
sorted_items = sorted(self.auctions.items(), key=lambda x: -get_highest(x[1]))
else:
sorted_items = list(self.auctions.items())
result = [] # 準備裝 format 完嘅 string
for aid, auc in sorted_items: # 逐個 auction 行
highest = get_highest(auc)
result.append(f"{aid}({highest})") # 格式化成 "auc1(200)"
return result
def list_bids(self, timestamp, auction_id): # 列出一個 auction 嘅所有出價,按 amount desc
self._process_expired_auctions(timestamp) # 先處理到期 auction
if auction_id not in self.auctions: # 唔存在?
return [] # 空 list
auc = self.auctions[auction_id] # 攞 auction pointer
return sorted(auc["bids"], key=lambda x: -x["amount"]) # 按金額 desc 排
list_auctions(10, "price") 點運作:
auctions = {"auc1": ...(highest=200), "auc2": ...(highest=50)}
sort_by="price" → sorted by -get_highest(auc)
→ [("auc1", ...), ("auc2", ...)]
→ ["auc1(200)", "auc2(50)"]
sort_by="id" → sorted by auction_id alphabetically
→ [("auc1", ...), ("auc2", ...)]
→ ["auc1(200)", "auc2(50)"]
list_bids(10, "auc1") →
bids 原本:[{amt:150, time:2}, {amt:200, time:4}]
sorted by -amount:
[{amt:200, time:4}, {amt:150, time:2}]
注意:list_bids return 成個 bid dict list
唔係 format string
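list_auctions sorts by price with a single key, so auctions with the same highest bid keep dictionary insertion order (Python's sort is stable). If a spec asks for a deterministic tie-break, a tuple key is one option. The sketch below assumes ties break by auction_id ascending; that rule is an assumption, not something these notes specify.

```python
auctions = {
    "auc2": {"bids": [{"amount": 200}]},
    "auc1": {"bids": [{"amount": 200}]},
    "auc3": {"bids": []},
}


def highest(auc):
    """Highest bid of one auction, or 0 when nobody has bid (same rule as get_highest)."""
    return auc["bids"][-1]["amount"] if auc["bids"] else 0


# Negated amount gives descending price; auction_id breaks ties in ascending order.
by_price = sorted(auctions.items(), key=lambda kv: (-highest(kv[1]), kv[0]))
formatted = [f"{aid}({highest(auc)})" for aid, auc in by_price]
```

With a single key `-highest(...)`, auc2 would come before auc1 here simply because it was inserted first.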
create_auction_with_deadline opens an auction with a deadline. _process_expired_auctions closes expired auctions lazily. get_winner returns the winner.
def create_auction_with_deadline(self, timestamp, auction_id, item_name, starting_price, duration_ms): # 開有期限嘅拍賣
self._process_expired_auctions(timestamp) # 先處理到期嘅 auction
if auction_id in self.auctions: # 已經有?
return None # 拒絕重複
expires_at = timestamp + duration_ms # 計算截止時間
self.auctions[auction_id] = { # 開新拍賣
"item": item_name,
"starting_price": starting_price,
"bids": [],
"status": "OPEN",
"expires_at": expires_at, # 有截止時間!
"reserve_price": None
}
return expires_at # 回傳截止時間
def get_winner(self, timestamp, auction_id): # 攞贏家
self._process_expired_auctions(timestamp) # 先處理到期嘅 auction
if auction_id not in self.auctions: # 唔存在?
return None
auc = self.auctions[auction_id] # 攞 auction pointer
if auc["status"] != "CLOSED": # 未截標?
return None # 仲未有 winner
if not auc["bids"]: # 截咗標但冇人出價?
return None # 流拍(冇 winner)
# L4 加:check reserve price
if auc["reserve_price"] is not None and auc["bids"][-1]["amount"] < auc["reserve_price"]: # 最高出價未到底價?
return None # 未過底價 → 流拍
return auc["bids"][-1]["bidder"] # 最高出價者 = winner
create_auction_with_deadline(1000, "auc2", "vase", 50, 3000)
→ expires_at = 1000 + 3000 = 4000
→ return 4000
之後如果 timestamp >= 4000 嘅 method 被 call
→ _process_expired_auctions 會將 auc2 改做 CLOSED
get_winner 嘅邏輯:
1. 未 CLOSED → None(仲拍緊)
2. CLOSED + 冇 bid → None(流拍)
3. CLOSED + 有 bid + 未到 reserve → None(L4 流拍)
4. CLOSED + 有 bid + 過 reserve → bids[-1]["bidder"]
例子:
auc2 有 bids: [{bidder:"b1", amount:80, time:2000}]
timestamp=5000 call get_winner
→ _process_expired_auctions(5000):5000 >= 4000 → CLOSED
→ status == "CLOSED" ✓
→ bids 有嘢 ✓
→ reserve_price == None → skip check
→ return "b1"
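The body of the lazy-close helper is not shown in this part of the notes. A minimal sketch, assuming it follows the rule stated above (an OPEN auction with a deadline becomes CLOSED once timestamp >= expires_at); the free function `process_expired_auctions` is a hypothetical stand-in for the method.

```python
def process_expired_auctions(auctions, timestamp):
    """Lazily close every OPEN auction whose deadline has been reached."""
    for auc in auctions.values():
        if (auc["status"] == "OPEN"
                and auc["expires_at"] is not None
                and timestamp >= auc["expires_at"]):
            auc["status"] = "CLOSED"


auctions = {
    "auc1": {"status": "OPEN", "expires_at": None},       # no deadline, never closes
    "auc2": {"status": "OPEN", "expires_at": 4000},       # closes at ts >= 4000
    "auc3": {"status": "CANCELLED", "expires_at": 4000},  # not OPEN, left untouched
}

process_expired_auctions(auctions, 3999)
before = auctions["auc2"]["status"]

process_expired_auctions(auctions, 4000)
after = {aid: auc["status"] for aid, auc in auctions.items()}
```

Checking the status first keeps a CANCELLED auction from being flipped to CLOSED when its deadline passes.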
set_reserve_price sets a reserve price. get_history returns the bid history. cancel_auction cancels an auction.
def set_reserve_price(self, timestamp, auction_id, min_price): # 設底價(賣家嘅最低接受價)
self._process_expired_auctions(timestamp) # 先處理到期嘅 auction
if auction_id not in self.auctions: # 唔存在?
return False
auc = self.auctions[auction_id] # 攞 auction pointer
if auc["status"] != "OPEN": # 仲開緊先可以改底價
return False
auc["reserve_price"] = min_price # 設底價
return True
def get_history(self, timestamp, auction_id): # 攞出價歷史(所有 bid events)
self._process_expired_auctions(timestamp) # 先處理到期嘅 auction
if auction_id not in self.auctions: # 唔存在?
return []
return list(self.auctions[auction_id]["bids"]) # copy 返個 list(唔好俾 caller 直接改原本)
def cancel_auction(self, timestamp, auction_id): # 取消拍賣(refund 所有 bidder)
self._process_expired_auctions(timestamp) # 先處理到期嘅 auction
if auction_id not in self.auctions: # 唔存在?
return False
auc = self.auctions[auction_id] # 攞 auction pointer
if auc["status"] != "OPEN": # 只有 OPEN 嘅先可以取消
return False
auc["status"] = "CANCELLED" # 標記取消
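# refund all bidders: this in-memory version never actually deducted any money,
# so "refund" could mean clearing bids; this code takes the other option and only records the CANCELLED status, leaving bids intact for external handling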
return True # 取消成功
set_reserve_price(5, "auc1", 180) →
auc1["reserve_price"] = 180
如果截標時最高 bid < 180 → 流拍(get_winner return None)
get_history(5, "auc1") →
[
{"bidder": "b1", "amount": 150, "time": 2},
{"bidder": "b2", "amount": 200, "time": 4}
]
注意:return 嘅係 copy(list(...)),唔係原本 pointer
cancel_auction(5, "auc1") →
auc1["status"] = "CANCELLED"
return True
After that, place_bid(6, "auc1", ...) → False (status != "OPEN")
After that, get_winner(6, "auc1") → None (status != "CLOSED")
Reserve price 嘅流拍邏輯:
auc1 bids = [{amt:150}, {amt:170}]
reserve_price = 180
截標後 get_winner:
→ bids[-1]["amount"] = 170 < 180 → return None(流拍)
如果 bids = [{amt:150}, {amt:200}]
→ bids[-1]["amount"] = 200 >= 180 → return "b2"(成交)
Handle multiple operations concurrently. Skeleton: async def + execute_op + gather + one lock per auction_id.
# Step 1: define how to handle a single op
# Step 2: take the lock for the op's auction_id
# Step 3: collect all the coroutines, then gather them in one go
async def batch_operations(self, timestamp, ops):  # process a batch of auction actions
    self._process_expired_auctions(timestamp)  # close expired auctions before starting
    async def execute_op(op):  # handles one op at a time
        aid = op["auction_id"]  # which auction this op touches
        async with self.auction_locks[aid]:  # only one op may modify a given auction at a time
            if op["type"] == "create":  # open a new auction
                return self.create_auction(timestamp, op["auction_id"], op["item_name"], op["starting_price"])
            elif op["type"] == "bid":  # place a bid
                return self.place_bid(timestamp, op["auction_id"], op["bidder_id"], op["amount"])
            elif op["type"] == "cancel":  # cancel
                return self.cancel_auction(timestamp, op["auction_id"])
    # collect all the coroutines and gather them in one go
    tasks = []  # pending list
    for op in ops:  # walk every operation
        tasks.append(execute_op(op))  # wrap as a coroutine
    results = await asyncio.gather(*tasks)  # run them all concurrently
    return list(results)  # one result per op, in input order
# L5 嘅 __init__:
def __init__(self):
self.auctions = {}
self.auction_locks = defaultdict(asyncio.Lock) # L5 加
# Lock 點用:
# 每個 auction_id 一把鎖
# 同一個 auction 嘅 op 排隊
# 唔同 auction 嘅 op 可以同時跑
# 同 Bank L5 嘅分別:
# Bank transfer 涉及兩個 account → sorted 防 deadlock
# Auction 每個 op 只涉及一個 auction → 一把鎖就夠
# 唔使 sorted!因為冇跨 auction 操作
# batch_operations 例子:
# ops = [
# {"type": "bid", "auction_id": "auc1", "bidder_id": "b1", "amount": 200},
# {"type": "bid", "auction_id": "auc1", "bidder_id": "b2", "amount": 250},
# {"type": "create", "auction_id": "auc3", "item_name": "car", "starting_price": 1000}
# ]
# → auc1 嘅兩個 bid 會排隊(同一把鎖)
# → auc3 嘅 create 唔使等(唔同鎖)
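The per-auction locking described above can be tried in isolation. The sketch below is a simplified, standalone version (bids stored as plain amounts, `run_batch` is a hypothetical name); the `asyncio.sleep(0)` is added only to give the event loop a chance to interleave coroutines, so the lock actually has something to protect.

```python
import asyncio
from collections import defaultdict


async def run_batch(ops):
    auctions = {"auc1": {"starting_price": 100, "bids": []}}
    locks = defaultdict(asyncio.Lock)  # one lock per auction_id, created on first use

    async def execute_op(op):
        aid = op["auction_id"]
        async with locks[aid]:          # ops on the same auction queue up here
            await asyncio.sleep(0)      # yield, so other coroutines get a turn
            if op["type"] == "create":
                if aid in auctions:
                    return False
                auctions[aid] = {"starting_price": op["starting_price"], "bids": []}
                return True
            if op["type"] == "bid":
                auc = auctions.get(aid)
                if auc is None:
                    return False
                current = auc["bids"][-1] if auc["bids"] else auc["starting_price"]
                if op["amount"] <= current:
                    return False
                auc["bids"].append(op["amount"])
                return True
            return None                 # unknown op type

    results = await asyncio.gather(*(execute_op(op) for op in ops))
    return list(results), auctions


ops = [
    {"type": "bid", "auction_id": "auc1", "amount": 200},
    {"type": "bid", "auction_id": "auc1", "amount": 250},
    {"type": "create", "auction_id": "auc3", "starting_price": 1000},
]
results, auctions = asyncio.run(run_batch(ops))
```

`asyncio.gather` returns results in the order the coroutines were passed in, regardless of which one finishes first, so `results[i]` always belongs to `ops[i]`.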
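Same as L5, plus a Semaphore that caps how many settlements run at once. Fail-fast: an auction must be CLOSED and have a winner to pass the check. Only those that pass enter the semaphore and sleep.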
# Step 1: run the fail-fast checks (CLOSED + has a winner)
# Step 2: only the ones that pass enter the semaphore and simulate the settlement (sleep)
async def settle_auctions(self, timestamp, settlements, max_concurrent):  # settle a batch concurrently, rate-limited
    self._process_expired_auctions(timestamp)  # close expired auctions before starting
    sem = asyncio.Semaphore(max_concurrent)  # at most N settlements at the same time
    async def do_settle(s):  # handles one settlement
        aid = s["auction_id"]  # which auction this settlement is for
        lock = self.auction_locks[aid]  # take its lock
        # Step 1: fail-fast checks
        async with lock:  # lock the auction
            if aid not in self.auctions:  # auction does not exist
                return False  # fail immediately, never enters the semaphore
            auc = self.auctions[aid]  # fetch the auction
            if auc["status"] != "CLOSED":  # not closed yet?
                return False  # cannot settle
            if not auc["bids"]:  # nobody bid?
                return False  # no winner → cannot settle
            # L4 check: reserve price
            if auc["reserve_price"] is not None and auc["bids"][-1]["amount"] < auc["reserve_price"]:
                return False  # reserve not met → no sale → no settlement
        # Step 2: passed the checks, now enter the semaphore (outside the lock) and simulate the settlement
        async with sem:  # compete for a settlement slot
            await asyncio.sleep(0.01)  # simulate the settlement (payment, shipping)
        return True  # settlement succeeded
    tasks = []  # one coroutine per settlement
    for s in settlements:  # walk the whole batch
        tasks.append(do_settle(s))  # wrap as a coroutine
    results = await asyncio.gather(*tasks)  # run them all concurrently
    return list(results)  # True or False per settlement
settle_auctions 嘅 fail-fast 邏輯:
1. auction 唔存在 → False(唔入 sem)
2. status != "CLOSED" → False(仲拍緊或已取消)
3. 冇 bids → False(冇 winner)
4. reserve_price 有值 + 最高 bid < reserve → False(流拍)
5. 全部 pass → 入 sem + sleep → True
例子:
settlements = [
{"auction_id": "auc1"}, # CLOSED, winner=b2 → True
{"auction_id": "auc2"}, # OPEN → False(未截標)
{"auction_id": "auc3"}, # CLOSED, no bids → False
{"auction_id": "auc99"}, # 唔存在 → False
]
max_concurrent = 2
→ results = [True, False, False, False]
→ 只有 auc1 真正入咗 sem + sleep
L5 vs L6 分別:
L5 batch_operations:gather + Lock
所有 op 同時跑,鎖住 auction 防 race
L6 settle_auctions:gather + Lock + Semaphore
Lock 包住 fail-fast check
Sem 包住外部交收(限制同時幾多個)
分開用!Lock 先,Sem 後
Fail 嘅唔入 sem、唔 sleep
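To see that failing items really skip the semaphore and the sleep, the "lock first, semaphore after" pattern can be sketched standalone. The names `settle_all` and `slept` are hypothetical, and the reserve-price check is left out to keep the sketch short.

```python
import asyncio
from collections import defaultdict


async def settle_all(auctions, auction_ids, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)
    locks = defaultdict(asyncio.Lock)
    slept = []  # records which auctions actually reached the simulated external call

    async def settle(aid):
        async with locks[aid]:  # the lock guards only the local checks
            auc = auctions.get(aid)
            if auc is None or auc["status"] != "CLOSED" or not auc["bids"]:
                return False    # fail-fast: returns before touching the semaphore
        async with sem:         # lock already released; semaphore guards the slow part
            await asyncio.sleep(0.01)
            slept.append(aid)
        return True

    results = await asyncio.gather(*(settle(aid) for aid in auction_ids))
    return list(results), slept


auctions = {
    "auc1": {"status": "CLOSED", "bids": [200]},
    "auc2": {"status": "OPEN", "bids": [80]},
    "auc3": {"status": "CLOSED", "bids": []},
}
results, slept = asyncio.run(
    settle_all(auctions, ["auc1", "auc2", "auc3", "auc99"], max_concurrent=2)
)
```

Releasing the lock before waiting on the semaphore means a slow settlement never blocks other operations on the same auction longer than the check itself.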
A library system: add books → borrow → return → search → overdue fees → reservation queue → batch operations → concurrent sync.
# Library = Flat dict + Lazy overdue 系統
# 核心 data structure:
self.books = {} # book_id → {title, borrowed_by, borrowed_at, expires_at, borrow_count, reservations, history}
# 每個 method 第一個 param 都係 timestamp
# L3 開始每個 method 開頭都 call self._process_overdue(timestamp)
# Level 進程:
# L1: CRUD(加書 / 借書 / 還書 / 查書)
# L2: Sort/Search(排序 / prefix 搜尋)
# L3: TTL/Lazy(借書到期 + 自動還 + 罰款)
# L4: Reservation + History(預約排隊 + 事件歷史)
# L5: Concurrent Batch(async gather + lock per book_id)
# L6: Rate Limited Sync(+ semaphore + sleep)
Library 同其他 system 嘅分別:
Bank 用 account → balance(數字加減)
Library 用 book → borrowed_by(狀態切換)
Library 嘅特色:
1. 每個 method 都有 timestamp param
2. L3 開始有 lazy processing(_process_overdue)
3. borrow / return 係狀態機(None ↔ user_id)
4. reservations 係 queue(list.pop(0) = FIFO)
5. 還書時自動將下一個預約者借出
books 實際樣子:
{
"book1": {
"title": "Python Cookbook",
"borrowed_by": "alice",
"borrowed_at": 1000,
"expires_at": None,
"borrow_count": 2,
"reservations": ["bob", "charlie"],
"history": [
{"ts": 500, "action": "add"},
{"ts": 1000, "action": "borrow", "user": "alice"}
]
}
}
import asyncio # L5 async 用
from collections import defaultdict # L5 auto-create lock 用
class LibrarySystem:
def __init__(self):
self.books = {} # L1 — book_id → {title, borrowed_by, borrowed_at, expires_at, borrow_count, reservations, history}
self.book_locks = defaultdict(asyncio.Lock) # L5 加 — 每本書一把鎖
self.books 實際樣子:
{
"book1": {
"title": "Python Cookbook",
"borrowed_by": None, # None = 書架上;有值 = 被借走
"borrowed_at": None, # 幾時被借(timestamp)
"expires_at": None, # L3:到期時間(None = 冇期限)
"borrow_count": 0, # 總共被借過幾次
"reservations": [], # L4:預約排隊嘅 user list
"history": [] # L4:事件歷史
}
}
點攞 data:
book = self.books["book1"]
print(book["borrowed_by"])
# None → 未被借
book["borrowed_by"] = "alice"
# 而家 book1 被 alice 借咗(pointer 直接改)
# borrowed_by 嘅狀態:
# None = 可借 / 書架上
# "alice" = 已被 alice 借走
# Helper: _process_overdue — 每個 method 開頭都 call(L3 開始)
# Lazy processing:唔會自動到期,要等有人 call method 帶 timestamp 入嚟先 check
def _process_overdue(self, timestamp): # 行晒所有書,過期嘅自動還
for book_id, book in self.books.items(): # 逐本書睇
if book["borrowed_by"] is not None and book["expires_at"] is not None and timestamp >= book["expires_at"]: # 有人借 + 有期限 + 過期
book["borrowed_by"] = None # 自動還書,放返上書架
book["borrowed_at"] = None # 清除借出時間
book["expires_at"] = None # 清除到期時間
# L4 加:如果有人排緊隊,自動借畀下一位
if book["reservations"]: # 有冇人預約緊?
next_user = book["reservations"].pop(0) # FIFO:攞排最前嗰個
book["borrowed_by"] = next_user # 自動借畀佢
book["borrowed_at"] = timestamp # 用而家嘅時間做借出時間
book["borrow_count"] += 1 # 借出次數 +1
book["history"].append({"ts": timestamp, "action": "borrow", "user": next_user}) # 記低歷史
_process_overdue 嘅職責:
每次有人 call 任何 method 帶 timestamp 入嚟
→ 行晒 self.books dict
→ 搵有人借 + 有期限 + 已過期嘅
→ 自動還書(borrowed_by = None)
→ 如果有預約排隊,自動借畀下一位
# 點解叫 "lazy"?
# 因為唔係到期就自動還
# 要等下一次有人 call method 先 check
# 例如 expires_at = 5000
# 但如果 timestamp=4999 call get_book
# → check: 4999 < 5000 → 唔還
# timestamp=5001 call list_books
# → check: 5001 >= 5000 → 自動還!
overdue 後有人排隊嘅流程:
book1 被 alice 借咗,expires_at = 5000
bob 預約咗 book1 → reservations = ["bob"]
timestamp=5001 call 任何 method:
1. _process_overdue 發現 book1 過期
2. borrowed_by = None(alice 還咗)
3. reservations 有人!pop(0) → "bob"
4. borrowed_by = "bob"(自動借畀 bob)
5. borrow_count += 1
結果:book1 直接由 alice 手上跳去 bob
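The overdue hand-off above can be run as a small standalone sketch. One swap to note: the notes use `list.pop(0)` for the FIFO queue, which is O(n) per pop; the sketch uses `collections.deque.popleft()` instead, which is O(1). That substitution is a suggestion, not what the notes' code does, and `process_overdue` here is a simplified free function without history tracking.

```python
from collections import deque


def process_overdue(books, timestamp):
    """Auto-return overdue books and lend each to the next reserver, if any."""
    for book in books.values():
        expired = (book["borrowed_by"] is not None
                   and book["expires_at"] is not None
                   and timestamp >= book["expires_at"])
        if not expired:
            continue
        book["borrowed_by"] = None   # auto-return
        book["expires_at"] = None
        if book["reservations"]:     # FIFO: the earliest reserver gets the book
            book["borrowed_by"] = book["reservations"].popleft()
            book["borrow_count"] += 1


books = {
    "book1": {
        "borrowed_by": "alice",
        "expires_at": 5000,
        "borrow_count": 1,
        "reservations": deque(["bob", "charlie"]),
    }
}

process_overdue(books, 4999)            # 4999 < 5000, nothing happens
holder_before = books["book1"]["borrowed_by"]

process_overdue(books, 5001)            # 5001 >= 5000, alice out, bob in
holder_after = books["book1"]["borrowed_by"]
queue_after = list(books["book1"]["reservations"])
```

For the short queues in an exercise like this the difference is negligible, so `list.pop(0)` remains a reasonable choice when speed of writing matters more.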
Add a book (add_book), borrow (borrow_book), return (return_book), look up (get_book). Each method checks whether the book exists and returns the appropriate value.
def add_book(self, timestamp, book_id, title): # 加一本新書入圖書館
self._process_overdue(timestamp) # L3 加:先處理過期嘅書
if book_id in self.books: # 呢本書已經有?
return False # 已經有 → 拒絕重複加
self.books[book_id] = { # 開一本新書嘅記錄
"title": title, # 書名
"borrowed_by": None, # 冇人借住
"borrowed_at": None, # 冇借出時間
"expires_at": None, # 冇到期時間
"borrow_count": 0, # 借出次數
"reservations": [], # L4 加:預約排隊
"history": [{"ts": timestamp, "action": "add"}] # L4 加:事件歷史
}
return True # 加成功
def borrow_book(self, timestamp, book_id, user_id): # 借書
self._process_overdue(timestamp) # L3 加:先處理過期嘅書
if book_id not in self.books: # 書唔存在?
return False # 搵唔到呢本書
book = self.books[book_id] # 攞出嚟(pointer)
if book["borrowed_by"] is not None: # 已經有人借咗?
return False # 已經被借走 → 唔畀再借
book["borrowed_by"] = user_id # 記低邊個借
book["borrowed_at"] = timestamp # 記低幾時借
book["borrow_count"] += 1 # 借出次數 +1
book["history"].append({"ts": timestamp, "action": "borrow", "user": user_id}) # L4 加
return True # 借成功
add_book(1, "book1", "Python Cookbook") 之後:
self.books = {
"book1": {
"title": "Python Cookbook",
"borrowed_by": None,
"borrowed_at": None,
"expires_at": None,
"borrow_count": 0,
"reservations": [],
"history": [{"ts": 1, "action": "add"}]
}
}
borrow_book(2, "book1", "alice") 之後:
self.books = {
"book1": {
"title": "Python Cookbook",
"borrowed_by": "alice", # None → "alice"
"borrowed_at": 2, # None → 2
"expires_at": None, # L1 冇 TTL
"borrow_count": 1, # 0 → 1
"reservations": [],
"history": [
{"ts": 1, "action": "add"},
{"ts": 2, "action": "borrow", "user": "alice"}
]
}
}
borrow 嘅 fail 情景:
borrow_book(3, "book1", "bob") → False
因為 book1 已被 alice 借走
borrowed_by = "alice"(不是 None)
borrow_book(3, "book99", "bob") → False
因為 book99 唔存在
def return_book(self, timestamp, book_id): # 還書
self._process_overdue(timestamp) # L3 加:先處理過期嘅書
if book_id not in self.books: # 書唔存在?
return False # 搵唔到呢本書
book = self.books[book_id] # 攞出嚟
if book["borrowed_by"] is None: # 冇人借住?
return False # 書本身就喺架上 → 冇得還
book["history"].append({"ts": timestamp, "action": "return", "user": book["borrowed_by"]}) # L4 加
book["borrowed_by"] = None # 清除借出者
book["borrowed_at"] = None # 清除借出時間
book["expires_at"] = None # 清除到期時間
# L4 加:如果有人排緊隊,自動借畀下一位
if book["reservations"]: # 有冇人預約緊?
next_user = book["reservations"].pop(0) # FIFO:攞排最前嗰個
book["borrowed_by"] = next_user # 自動借畀佢
book["borrowed_at"] = timestamp # 用而家嘅時間做借出時間
book["borrow_count"] += 1 # 借出次數 +1
book["history"].append({"ts": timestamp, "action": "borrow", "user": next_user}) # 記低歷史
return True # 還成功
def get_book(self, timestamp, book_id): # 查書資料
self._process_overdue(timestamp) # L3 加:先處理過期嘅書
if book_id not in self.books: # 書唔存在?
return None # 搵唔到 → return None
return self.books[book_id] # return 成個 dict(pointer)
return_book(5, "book1") 之後(冇預約隊):
self.books = {
"book1": {
"title": "Python Cookbook",
"borrowed_by": None, # "alice" → None
"borrowed_at": None, # 2 → None
"expires_at": None,
"borrow_count": 1, # 唔變
"reservations": [],
"history": [
{"ts": 1, "action": "add"},
{"ts": 2, "action": "borrow", "user": "alice"},
{"ts": 5, "action": "return", "user": "alice"}
]
}
}
return_book 有預約隊嘅情景:
reservations = ["bob", "charlie"]
return_book(5, "book1") 之後:
1. alice 還書 → borrowed_by = None
2. pop(0) → "bob"
3. borrowed_by = "bob"(自動借畀 bob)
4. reservations = ["charlie"](bob 走咗)
結果:書一還就即刻畀下一個人借走
get_book(6, "book1") return:
return 成個 dict:
{
"title": "Python Cookbook",
"borrowed_by": None,
"borrowed_at": None,
...
}
get_book(6, "book99") → None(唔存在)
list_books: sorted listing (by title or by borrows). search_books: prefix search on the title. Output format: book_id(title).
def list_books(self, timestamp, sort_by): # 列出所有書(排序)
self._process_overdue(timestamp) # L3 加:先處理過期嘅書
items = list(self.books.items()) # [(book_id, book_dict), ...]
if sort_by == "title": # 按書名 A-Z 排
items.sort(key=lambda x: x[1]["title"]) # x[1] = book dict,攞 title
elif sort_by == "borrows": # 按借出次數排(多 → 少)
items.sort(key=lambda x: (-x[1]["borrow_count"], x[1]["title"])) # 次數 desc,同分按 title asc
# Format: "book_id(title)"
return [f"{bid}({book['title']})" for bid, book in items] # 格式化輸出
def search_books(self, timestamp, prefix): # prefix 搜尋書名
self._process_overdue(timestamp) # L3 加:先處理過期嘅書
results = [] # 收集符合 prefix 嘅書
for bid, book in self.books.items(): # 行晒所有書
if book["title"].startswith(prefix): # 書名開頭係呢個 prefix?
results.append(f"{bid}({book['title']})") # 格式化加入
return results # return 所有 match 嘅書
list_books 排序邏輯:
sort_by = "title":
按 title 字母順序 A-Z
sort_by = "borrows":
按 borrow_count 由大到小
如果 borrow_count 一樣 → 再按 title A-Z
lambda x: (-x[1]["borrow_count"], x[1]["title"])
-borrow_count → 大嘅排前面
title → 同分按字母排
例子:
books = {
"b1": {"title": "Alpha", "borrow_count": 3},
"b2": {"title": "Beta", "borrow_count": 5},
"b3": {"title": "Gamma", "borrow_count": 3}
}
list_books(ts, "borrows") →
["b2(Beta)", "b1(Alpha)", "b3(Gamma)"]
Beta(5) > Alpha(3) = Gamma(3), tie → A < G
search_books prefix:
books = {
"b1": {"title": "Python Cookbook"},
"b2": {"title": "Python Crash Course"},
"b3": {"title": "JavaScript Guide"}
}
search_books(ts, "Python") →
["b1(Python Cookbook)", "b2(Python Crash Course)"]
search_books(ts, "Java") →
["b3(JavaScript Guide)"]
search_books(ts, "Rust") →
[] (冇 match)
borrow_with_due: borrow with a due time. _process_overdue: auto-returns a book once it is due. get_overdue_fee: computes the overdue fee.
def borrow_with_due(self, timestamp, book_id, user_id, due_ms): # 有期限借書
self._process_overdue(timestamp) # 先處理過期嘅書
if book_id not in self.books: # 書唔存在?
return False
book = self.books[book_id] # 攞出嚟
if book["borrowed_by"] is not None: # 已經有人借咗?
return False # 唔畀再借
book["borrowed_by"] = user_id # 記低邊個借
book["borrowed_at"] = timestamp # 記低幾時借
book["expires_at"] = timestamp + due_ms # 到期時間 = 而家 + 期限
book["borrow_count"] += 1 # 借出次數 +1
book["history"].append({"ts": timestamp, "action": "borrow", "user": user_id}) # L4 加
return book["expires_at"] # return 到期時間
def get_overdue_fee(self, timestamp, book_id): # 計過期罰款
self._process_overdue(timestamp) # 先處理過期嘅書
if book_id not in self.books: # 書唔存在?
return 0
book = self.books[book_id] # 攞出嚟
# 只有正在被借 + 有期限 + 已過期 先計罰款
if book["borrowed_by"] is None: # 冇人借?
return 0 # 書架上 → 冇罰款
if book["expires_at"] is None: # 冇期限?
return 0 # 永久借 → 冇罰款
if timestamp <= book["expires_at"]: # 未過期?
return 0 # 仲有時間 → 冇罰款
# 過期幾耐 = 而家 - 到期時間
overdue_time = timestamp - book["expires_at"] # 遲咗幾多 ms
fee = overdue_time # 1ms = 1 unit fee(可按題目調整比率)
return fee # return 罰款金額
borrow_with_due 嘅時間計算:
borrow_with_due(1000, "book1", "alice", 5000)
expires_at = 1000 + 5000 = 6000
return 6000(到期時間)
如果 timestamp=5999 call _process_overdue:
5999 < 6000 → 未過期,唔做嘢
如果 timestamp=6001 call _process_overdue:
6001 >= 6000 → 過期!自動還書
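How get_overdue_fee computes the fee (the fee branch taken in isolation):
book1 is borrowed, expires_at = 6000
timestamp = 6500:
timestamp(6500) > expires_at(6000) → overdue
fee = 6500 - 6000 = 500
timestamp = 5500:
timestamp(5500) <= expires_at(6000) → not overdue
fee = 0
Caveat: get_overdue_fee calls _process_overdue first, and _process_overdue auto-returns every book with timestamp >= expires_at.
So whenever timestamp > expires_at:
_process_overdue runs first → the book has already been returned (or handed to the next reserver with expires_at = None)
then borrowed_by is None or expires_at is None → return 0
Conclusion: with the code as written, get_overdue_fee(6500, "book1") actually returns 0; the fee branch is never reached.
To report a fee for an overdue book, the fee has to be captured before (or inside) _process_overdue, at the moment the book is auto-returned.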
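One way to keep the fee from being lost is to record it at the moment the lazy helper auto-returns the book. A minimal sketch under stated assumptions: the `fees` dict (user_id → accumulated fee) is hypothetical and not part of the notes' data structure, and the rate is the notes' placeholder of 1 unit per ms.

```python
def process_overdue(books, fees, timestamp):
    """Auto-return overdue books, charging the borrower before the state is cleared."""
    for book in books.values():
        if (book["borrowed_by"] is not None
                and book["expires_at"] is not None
                and timestamp >= book["expires_at"]):
            user = book["borrowed_by"]
            overdue_ms = timestamp - book["expires_at"]    # how late, in ms
            fees[user] = fees.get(user, 0) + overdue_ms    # 1 ms = 1 unit (placeholder rate)
            book["borrowed_by"] = None                     # now it is safe to clear
            book["expires_at"] = None


books = {"book1": {"borrowed_by": "alice", "expires_at": 6000}}
fees = {}

process_overdue(books, fees, 5500)   # not due yet, no charge
fees_before = dict(fees)

process_overdue(books, fees, 6500)   # 500 ms late, charged at auto-return
```

Note that with lazy processing the measured lateness depends on when the next method call arrives, so whether this matches a given spec is something to check against its wording.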
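reserve_book: while a book is borrowed, users can queue a reservation; when the book is returned it is lent to the next user in the queue automatically. get_history: returns the event history.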
def reserve_book(self, timestamp, book_id, user_id): # 預約書
self._process_overdue(timestamp) # 先處理過期嘅書
if book_id not in self.books: # 書唔存在?
return False
book = self.books[book_id] # 攞出嚟
if book["borrowed_by"] is None: # 書喺架上?
return False # 書而家冇人借,直接借啦,唔使預約
if user_id in book["reservations"]: # 已經排咗隊?
return False # 唔好重複排
book["reservations"].append(user_id) # 加入預約隊尾
book["history"].append({"ts": timestamp, "action": "reserve", "user": user_id}) # 記低
return True # 預約成功
def get_history(self, timestamp, book_id): # 查書嘅事件歷史
self._process_overdue(timestamp) # 先處理過期嘅書
if book_id not in self.books: # 書唔存在?
return [] # 空 list
return self.books[book_id]["history"] # return 歷史 list
reserve_book 預約流程:
前提:book1 已被 alice 借走
reserve_book(10, "book1", "bob") → True
reservations = ["bob"]
reserve_book(11, "book1", "charlie") → True
reservations = ["bob", "charlie"]
reserve_book(12, "book1", "bob") → False
bob 已經排咗隊,唔好重複
還書時自動觸發(喺 return_book 入面):
return_book(20, "book1") → alice 還書
1. reservations.pop(0) → "bob"
2. borrowed_by = "bob"
3. reservations = ["charlie"]
4. bob 唔使主動 borrow,自動攞到
get_history return 例子:
get_history(99, "book1") →
[
{"ts": 1, "action": "add"},
{"ts": 2, "action": "borrow", "user": "alice"},
{"ts": 10, "action": "reserve", "user": "bob"},
{"ts": 20, "action": "return", "user": "alice"},
{"ts": 20, "action": "borrow", "user": "bob"}
]
每個 event 都有 ts + action
borrow/return/reserve 有 user field
Handle multiple operations concurrently. One lock per book_id prevents race conditions on the same book. gather runs everything in one go.
# 第一步:先定義點樣處理單一 op
# 第二步:按操作涉及嘅 book_id 去攞鎖
# 第三步:收集晒 coroutine,再一次過 gather
async def batch_operations(self, timestamp, ops): # 批量處理一堆圖書館動作
self._process_overdue(timestamp) # 開工前先補返到期書
async def execute_op(op): # 每次只處理一張單
bid = op["book_id"] # 呢張單操作邊本書
async with self.book_locks[bid]: # 同一本書一次只畀一張單改
if op["type"] == "borrow": # 借書
return self.borrow_book(timestamp, bid, op["user_id"])
elif op["type"] == "return": # 還書
return self.return_book(timestamp, bid)
elif op["type"] == "reserve": # 預約
return self.reserve_book(timestamp, bid, op["user_id"])
elif op["type"] == "add": # 加書
return self.add_book(timestamp, bid, op["title"])
# 收集所有 coroutine,一次過跑
tasks = [] # 待辦 list
for op in ops: # 行晒每張單
tasks.append(execute_op(op)) # 包成 coroutine
results = await asyncio.gather(*tasks) # 全部同時跑,各自靠 lock 保護
return list(results) # return 每張單嘅結果
L5 嘅 __init__(加 book_locks):
def __init__(self):
self.books = {}
self.book_locks = defaultdict(asyncio.Lock) # L5 加
Lock per book_id:
每本書一把鎖
同一本書嘅操作會排隊
唔同書嘅操作可以同時跑
例如:
op1: borrow book1
op2: return book1
op3: borrow book2
op1 同 op2 要排隊(同一本書)
op3 可以同 op1/op2 同時跑(唔同書)
batch_operations 用法:
ops = [
{"type": "borrow", "book_id": "b1", "user_id": "alice"},
{"type": "return", "book_id": "b2"},
{"type": "add", "book_id": "b3", "title": "New Book"}
]
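results = await lib.batch_operations(100, ops) # lib is a LibrarySystem instance
# results == [True, True, True], assuming b1 is on the shelf, b2 is currently borrowed, and b3 does not exist yet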
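Same as L5, plus a Semaphore that caps concurrency. Fail-fast: if the book does not exist or is not borrowed, return False immediately without sleeping. The Lock wraps the data changes; the Semaphore wraps the external call (the sleep).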
# 第一步:先喺本地鎖住做 fail-fast 檢查
# 第二步:真係過到關先離開 lock
# 第三步:過關嘅單先入 semaphore,模擬外部 API 慢慢處理
async def sync_library(self, timestamp, transfers, max_concurrent): # 批量同步轉移(書由一個館搬去另一個館)
self._process_overdue(timestamp) # 開工前先補返到期書
sem = asyncio.Semaphore(max_concurrent) # 限制同時幾多個外部 call
async def do_transfer(t): # 每次處理一本書嘅轉移
bid = t["book_id"] # 要轉移邊本書
lock = self.book_locks[bid] # 呢本書嘅鎖
# 第一步:fail-fast 檢查
async with lock: # 改書之前先鎖住
if bid not in self.books: # 書唔存在?
return False # 即刻作廢,唔去搶外部 API
if self.books[bid]["borrowed_by"] is None: # 書冇人借住?
return False # 要被借住先可以轉移
# 第二步:過到關,正式處理
self.books[bid]["borrowed_by"] = None # 清除借出狀態(書被轉走)
self.books[bid]["borrowed_at"] = None # 清除借出時間
self.books[bid]["expires_at"] = None # 清除到期
self.books[bid]["history"].append({"ts": timestamp, "action": "transfer"}) # 記低歷史
# 第三步:只得成功嘅單先入 semaphore 模擬外部同步
async with sem: # 搶外部 API 窗口
await asyncio.sleep(0.01) # 模擬同步到外部系統要等一陣
return True # 轉移成功
tasks = [] # 收集每本書嘅轉移 coroutine
for t in transfers: # 行晒成批 transfers
tasks.append(do_transfer(t)) # 包成 coroutine
results = await asyncio.gather(*tasks) # 全部同時跑
return list(results) # return 每本書成功定失敗
L6 嘅 __init__(同 L5 一樣):
def __init__(self):
self.books = {}
self.book_locks = defaultdict(asyncio.Lock)
Fail-fast 邏輯:
唔合格嘅單即刻 return False:
1. 書唔存在 → False(唔搶 sem)
2. 書冇人借住 → False(唔搶 sem)
合格嘅單先入 semaphore:
3. 書存在 + 有人借住 → 清除狀態 → 入 sem → sleep → True
點解要 fail-fast?
因為 semaphore 有限(max_concurrent)
唔好畀失敗嘅單白白佔住 sem 名額
L5 vs L6 分別:
L5:gather + Lock
全部同時跑,鎖住 book_id 防 race
L6:gather + Lock + Semaphore
Lock 包住改 data(本地操作)
Sem 包住外部 call(限流)
分開用!唔好 nested
Lock vs Semaphore:
Lock = 一把鎖,一次只畀一個 task 入
Semaphore(3) = 三個窗口,同時最多 3 個 task 入
sync_library 用法:
transfers = [
{"book_id": "b1"},
{"book_id": "b2"},
{"book_id": "b3"}
]
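results = await lib.sync_library(100, transfers, max_concurrent=2) # lib is a LibrarySystem instance
→ at most 2 books are in the external sync at the same time
→ results = [True, False, True] (assuming b1 and b3 are currently borrowed and b2 is not)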
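The claim that a Semaphore(N) lets at most N tasks through at once can be checked by tracking the peak number of tasks inside the guarded section. A minimal sketch; `run_limited`, `active` and `peak` are illustrative names.

```python
import asyncio


async def run_limited(n_jobs, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)
    active = 0  # tasks currently inside the semaphore
    peak = 0    # highest value `active` ever reached

    async def job():
        nonlocal active, peak
        async with sem:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # simulated external call
            active -= 1
        return True

    results = await asyncio.gather(*(job() for _ in range(n_jobs)))
    return list(results), peak


results, peak = asyncio.run(run_limited(5, 2))
```

All five jobs finish, but never more than two are in the simulated external call at the same moment. Swapping the semaphore for a single Lock would bring the peak down to 1.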
Below is the complete Python script for the Library System; it can be copied and run directly.
import asyncio
from collections import defaultdict

class LibrarySystem:
    def __init__(self):
        self.books = {}  # book_id → {title, borrowed_by, borrowed_at, expires_at, borrow_count, reservations, history}
        self.book_locks = defaultdict(asyncio.Lock)  # L5: one lock per book

    # ─── Helper: Lazy overdue processing ───────────────────
    def _process_overdue(self, timestamp):  # scan every book and auto-return the overdue ones
        for book_id, book in self.books.items():
            if (book["borrowed_by"] is not None
                    and book["expires_at"] is not None
                    and timestamp >= book["expires_at"]):  # borrowed + has a due time + overdue
                book["borrowed_by"] = None  # auto-return
                book["borrowed_at"] = None
                book["expires_at"] = None
                if book["reservations"]:  # someone is queued? lend to the next user automatically
                    next_user = book["reservations"].pop(0)
                    book["borrowed_by"] = next_user
                    book["borrowed_at"] = timestamp
                    book["borrow_count"] += 1
                    book["history"].append({"ts": timestamp, "action": "borrow", "user": next_user})

    # ─── L1: CRUD ─────────────────────────────────────────
    def add_book(self, timestamp, book_id, title):  # add a book to the library
        self._process_overdue(timestamp)
        if book_id in self.books:
            return False
        self.books[book_id] = {
            "title": title, "borrowed_by": None, "borrowed_at": None,
            "expires_at": None, "borrow_count": 0, "reservations": [],
            "history": [{"ts": timestamp, "action": "add"}]
        }
        return True

    def borrow_book(self, timestamp, book_id, user_id):  # borrow a book
        self._process_overdue(timestamp)
        if book_id not in self.books:
            return False
        book = self.books[book_id]
        if book["borrowed_by"] is not None:
            return False
        book["borrowed_by"] = user_id
        book["borrowed_at"] = timestamp
        book["borrow_count"] += 1
        book["history"].append({"ts": timestamp, "action": "borrow", "user": user_id})
        return True

    def return_book(self, timestamp, book_id):  # return a book
        self._process_overdue(timestamp)
        if book_id not in self.books:
            return False
        book = self.books[book_id]
        if book["borrowed_by"] is None:
            return False
        book["history"].append({"ts": timestamp, "action": "return", "user": book["borrowed_by"]})
        book["borrowed_by"] = None
        book["borrowed_at"] = None
        book["expires_at"] = None
        if book["reservations"]:  # someone is queued? lend to the next user automatically
            next_user = book["reservations"].pop(0)
            book["borrowed_by"] = next_user
            book["borrowed_at"] = timestamp
            book["borrow_count"] += 1
            book["history"].append({"ts": timestamp, "action": "borrow", "user": next_user})
        return True

    def get_book(self, timestamp, book_id):  # look up a book
        self._process_overdue(timestamp)
        if book_id not in self.books:
            return None
        return self.books[book_id]

    # ─── L2: Sort / Search ────────────────────────────────
    def list_books(self, timestamp, sort_by):  # list books (sorted)
        self._process_overdue(timestamp)
        items = list(self.books.items())
        if sort_by == "title":
            items.sort(key=lambda x: x[1]["title"])
        elif sort_by == "borrows":
            items.sort(key=lambda x: (-x[1]["borrow_count"], x[1]["title"]))
        return [f"{bid}({book['title']})" for bid, book in items]

    def search_books(self, timestamp, prefix):  # prefix search on the title
        self._process_overdue(timestamp)
        results = []
        for bid, book in self.books.items():
            if book["title"].startswith(prefix):
                results.append(f"{bid}({book['title']})")
        return results

    # ─── L3: TTL / Overdue Fee ─────────────────────────────
    def borrow_with_due(self, timestamp, book_id, user_id, due_ms):
        self._process_overdue(timestamp)
        if book_id not in self.books:
            return False
        book = self.books[book_id]
        if book["borrowed_by"] is not None:
            return False
        book["borrowed_by"] = user_id
        book["borrowed_at"] = timestamp
        book["expires_at"] = timestamp + due_ms
        book["borrow_count"] += 1
        book["history"].append({"ts": timestamp, "action": "borrow", "user": user_id})
        return book["expires_at"]

    def get_overdue_fee(self, timestamp, book_id):
        # Note: _process_overdue has already auto-returned any overdue book,
        # so as written this method returns 0 in practice.
        self._process_overdue(timestamp)
        if book_id not in self.books:
            return 0
        book = self.books[book_id]
        if book["borrowed_by"] is None:
            return 0
        if book["expires_at"] is None:
            return 0
        if timestamp <= book["expires_at"]:
            return 0
        return timestamp - book["expires_at"]

    # ─── L4: Reservation + History ─────────────────────────
    def reserve_book(self, timestamp, book_id, user_id):
        self._process_overdue(timestamp)
        if book_id not in self.books:
            return False
        book = self.books[book_id]
        if book["borrowed_by"] is None:
            return False
        if user_id in book["reservations"]:
            return False
        book["reservations"].append(user_id)
        book["history"].append({"ts": timestamp, "action": "reserve", "user": user_id})
        return True

    def get_history(self, timestamp, book_id):
        self._process_overdue(timestamp)
        if book_id not in self.books:
            return []
        return self.books[book_id]["history"]

    # ─── L5: Concurrent Batch ─────────────────────────────
    async def batch_operations(self, timestamp, ops):
        self._process_overdue(timestamp)

        async def execute_op(op):
            bid = op["book_id"]
            async with self.book_locks[bid]:
                if op["type"] == "borrow":
                    return self.borrow_book(timestamp, bid, op["user_id"])
                elif op["type"] == "return":
                    return self.return_book(timestamp, bid)
                elif op["type"] == "reserve":
                    return self.reserve_book(timestamp, bid, op["user_id"])
                elif op["type"] == "add":
                    return self.add_book(timestamp, bid, op["title"])

        tasks = [execute_op(op) for op in ops]
        results = await asyncio.gather(*tasks)
        return list(results)

    # ─── L6: Rate Limited Sync ─────────────────────────────
    async def sync_library(self, timestamp, transfers, max_concurrent):
        self._process_overdue(timestamp)
        sem = asyncio.Semaphore(max_concurrent)

        async def do_transfer(t):
            bid = t["book_id"]
            lock = self.book_locks[bid]
            async with lock:
                if bid not in self.books:
                    return False
                if self.books[bid]["borrowed_by"] is None:
                    return False
                self.books[bid]["borrowed_by"] = None
                self.books[bid]["borrowed_at"] = None
                self.books[bid]["expires_at"] = None
                self.books[bid]["history"].append({"ts": timestamp, "action": "transfer"})
            async with sem:
                await asyncio.sleep(0.01)
            return True

        tasks = [do_transfer(t) for t in transfers]
        results = await asyncio.gather(*tasks)
        return list(results)

# ─── Tests ─────────────────────────────────────────────────
async def main():
    lib = LibrarySystem()
    # L1: CRUD
    assert lib.add_book(1, "b1", "Python Cookbook") == True
    assert lib.add_book(2, "b2", "JavaScript Guide") == True
    assert lib.add_book(3, "b1", "Duplicate") == False
    assert lib.borrow_book(4, "b1", "alice") == True
    assert lib.borrow_book(5, "b1", "bob") == False
    assert lib.return_book(6, "b1") == True
    assert lib.return_book(7, "b1") == False
    assert lib.get_book(8, "b1")["title"] == "Python Cookbook"
    assert lib.get_book(9, "b99") is None
    # L2: Sort / Search
    lib.borrow_book(10, "b1", "alice")
    lib.return_book(11, "b1")
    lib.borrow_book(12, "b1", "bob")
    lib.return_book(13, "b1")
    assert lib.list_books(14, "title") == ["b2(JavaScript Guide)", "b1(Python Cookbook)"]
    assert lib.list_books(15, "borrows") == ["b1(Python Cookbook)", "b2(JavaScript Guide)"]
    assert lib.search_books(16, "Python") == ["b1(Python Cookbook)"]
    assert lib.search_books(17, "Rust") == []
    # L3: TTL
    assert lib.borrow_with_due(100, "b2", "charlie", 50) == 150
    assert lib.get_overdue_fee(140, "b2") == 0
    book2 = lib.get_book(160, "b2")
    assert book2["borrowed_by"] is None
    # L4: Reservation
    lib.borrow_book(200, "b1", "alice")
    assert lib.reserve_book(201, "b1", "bob") == True
    assert lib.reserve_book(202, "b1", "charlie") == True
    lib.return_book(210, "b1")
    assert lib.get_book(211, "b1")["borrowed_by"] == "bob"
    history = lib.get_history(212, "b1")
    assert any(e["action"] == "reserve" for e in history)
    # L5: Batch
    lib.add_book(300, "b3", "Go Handbook")
    results = await lib.batch_operations(301, [
        {"type": "borrow", "book_id": "b3", "user_id": "dave"},
        {"type": "borrow", "book_id": "b2", "user_id": "eve"},
    ])
    assert results == [True, True]
    # L6: Sync
    results = await lib.sync_library(400, [
        {"book_id": "b3"},
        {"book_id": "b2"},
        {"book_id": "b99"},
    ], max_concurrent=2)
    assert results == [True, True, False]
    print("All tests passed!")

asyncio.run(main())
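Imagine you are writing a mock Playlist system. Each song has a song_id, a title and a duration. You need a class that simulates creating playlists, adding songs, removing songs, sorting, play history, shuffle and async batch operations.
Picture a playlist app: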
┌────────────────────────────────────────────────────────┐
│ playlist_id="pl1"  name="My Mix"                       │
│                                                        │
│ songs:                                                 │
│   song_id="s1"  title="Song A"  duration=180  plays=3  │
│   song_id="s2"  title="Song B"  duration=240  plays=0  │
│   song_id="s3"  title="Song C"  duration=120  plays=1  │
│                                                        │
│ play_history: [(ts=1000, "s1"), (ts=2000, "s3")]       │
└────────────────────────────────────────────────────────┘
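Each playlist has:
  playlist_id = playlist ID (unique key)
  name = playlist name
  songs = list of song dicts
Each song has:
  song_id = song ID (unique within one playlist)
  title = song title
  duration = length in seconds
  play_count = number of times played
  added_order = insertion order (1-based)
Rules:
1. playlist_id must be unique (create rejects duplicates)
2. song_id must be unique within one playlist
3. play_song records every play and increments play_count
4. Songs whose TTL has expired are removed from the playlist (lazy purge)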
# Example: queries against the playlist above
get_playlist(t, "pl1")
→ {"name": "My Mix", "songs": [...]}
list_songs(t, "pl1", "order")
→ "s1(Song A), s2(Song B), s3(Song C)"
list_songs(t, "pl1", "duration")
→ "s2(Song B), s1(Song A), s3(Song C)"
total_duration(t, "pl1") → 540
get_most_played(t, "pl1", 2)
→ [{"song_id": "s1", "play_count": 3, ...}, {"song_id": "s3", "play_count": 1, ...}]   (full song dicts, abbreviated here)
# Later levels add more:
# L2 adds sorting (list_songs) + total_duration + search
# L3 adds play_song + most_played + TTL expiry
# L4 adds shuffle + queue_next + history + merge
# L5 adds async batch_operations (per-playlist lock)
# L6 adds sync_playlists (rate-limited with a semaphore)
import asyncio
import random
from collections import defaultdict

class PlaylistSystem:
    def __init__(self):
        self.playlists = {}  # L1: all playlists (playlist_id → info dict)
        self.play_history = defaultdict(list)  # L3: play history (playlist_id → list of events)
        self.locks = defaultdict(asyncio.Lock)  # added in L5: one async lock per playlist
self.playlists = {
    "pl1": {
        "name": "My Mix",
        "songs": [
            {"song_id": "s1", "title": "Song A", "duration": 180, "play_count": 0, "added_order": 1},
            {"song_id": "s2", "title": "Song B", "duration": 240, "play_count": 0, "added_order": 2},
        ]
    }
}
# First-level key = playlist_id ("pl1")
# Second level: "songs" is a list with one dict per song
# added_order = insertion order (1-based; later additions get larger numbers)
self.play_history = {
    "pl1": [
        {"timestamp": 1000, "song_id": "s1"},
        {"timestamp": 2000, "song_id": "s3"},
    ]
}
# defaultdict(list): the first access to a key creates an empty list automatically
# Every play_song call appends one event
song_id │ title  │ duration │ play_count │ added_order │ expires_at
────────┼────────┼──────────┼────────────┼─────────────┼───────────
s1      │ Song A │ 180      │ 3          │ 1           │ None
s2      │ Song B │ 240      │ 0          │ 2           │ None
s3      │ Song C │ 120      │ 1          │ 3           │ 5000
L1: song_id, title, duration, play_count, added_order  # the basics
L2: (no new fields; only reads duration + title + added_order)
L3: expires_at (song level), play_history (playlist level)
L4: (no new fields; shuffle rewrites added_order, queue_next reads play_count)
L5: self.locks (defaultdict(asyncio.Lock) added in __init__)
L6: (no new fields; the semaphore is created inside the method)
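# Helper: _process_expiring_songs: lazy TTL purge that removes songs whose time is up (called at the top of every public method)
def _process_expiring_songs(self, timestamp):  # not a scheduled task; runs lazily
    for pl in self.playlists.values():  # visit every playlist
        remaining = []  # songs that have not expired
        for song in pl["songs"]:  # visit every song
            exp = song.get("expires_at")  # may be missing (None)
            if exp is not None and timestamp >= exp:  # time is up → drop it
                continue  # not appended to remaining = deleted
            remaining.append(song)  # still alive → keep
        pl["songs"] = remaining  # overwrite the old list (expired songs vanish)
_process_expiring_songs(timestamp)
  Walks the songs of every playlist in self.playlists.
  Any song with expires_at not None and timestamp >= expires_at
  is not kept (i.e. it is removed from the playlist).
  Note: the song really is deleted (unlike Gym, which only changes a field).
  Every public method calls it on its first line (lazy mode).
Gym TTL expiry → the member is only checked out (the member stays in the system)
Playlist TTL expiry → the song is deleted (the song is gone)
So Playlist collects the survivors in remaining = [] and then overwrites the list.
Never call remove() on a list while iterating over it (elements get skipped).
Same purge pattern as FS/Bank: collect → rebuild.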
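To see why the notes insist on collect → rebuild, here is a minimal standalone sketch. The two function names and the sample data are made up for illustration; only the rebuild version mirrors what _process_expiring_songs does.

```python
def purge_in_place_buggy(songs, now):
    """Removes while iterating: the iterator skips the element after each removal."""
    for song in songs:
        exp = song.get("expires_at")
        if exp is not None and now >= exp:
            songs.remove(song)
    return songs


def purge_by_rebuild(songs, now):
    """Collects the survivors into a new list, like _process_expiring_songs."""
    remaining = []
    for song in songs:
        exp = song.get("expires_at")
        if exp is not None and now >= exp:
            continue
        remaining.append(song)
    return remaining


def make_songs():
    # Hypothetical sample: s1 and s2 both expire at t=100, s3 never expires.
    return [
        {"song_id": "s1", "expires_at": 100},
        {"song_id": "s2", "expires_at": 100},
        {"song_id": "s3", "expires_at": None},
    ]


buggy = [s["song_id"] for s in purge_in_place_buggy(make_songs(), 100)]
fixed = [s["song_id"] for s in purge_by_rebuild(make_songs(), 100)]
print(buggy)  # ['s2', 's3']: s2 has expired but was skipped
print(fixed)  # ['s3']
```

After s1 is removed, s2 slides into index 0 while the iterator has already moved on to index 1, so s2 is never examined.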
create_playlist = create a playlist   add_song = add a song   remove_song = remove a song   get_playlist = look up a playlist
def create_playlist(self, timestamp, playlist_id, name):  # create a new playlist
    self._process_expiring_songs(timestamp)  # purge expired songs first (standard opening)
    if playlist_id in self.playlists:  # duplicate playlist_id → reject
        return False  # by convention, return False
    self.playlists[playlist_id] = {  # open a new playlist entry
        "name": name,  # store the name
        "songs": [],  # empty list (no songs yet)
    }
    return True  # created

def add_song(self, timestamp, playlist_id, song_id, title, duration_sec):  # add a song to a playlist
    self._process_expiring_songs(timestamp)  # purge expired songs first
    if playlist_id not in self.playlists:  # unknown playlist
        return False
    pl = self.playlists[playlist_id]  # fetch the playlist
    for song in pl["songs"]:  # look for a duplicate song_id
        if song["song_id"] == song_id:  # already in the playlist
            return False  # duplicates are rejected
    order = len(pl["songs"]) + 1  # compute added_order (1-based)
    pl["songs"].append({  # append to the songs list
        "song_id": song_id,  # song ID
        "title": title,  # song title
        "duration": duration_sec,  # length in seconds
        "play_count": 0,  # play count starts at 0
        "added_order": order,  # insertion order
    })
    return True  # added

def remove_song(self, timestamp, playlist_id, song_id):  # remove a song from a playlist
    self._process_expiring_songs(timestamp)  # purge expired songs first
    if playlist_id not in self.playlists:  # unknown playlist
        return False
    pl = self.playlists[playlist_id]  # fetch the playlist
    for i, song in enumerate(pl["songs"]):  # find the song
        if song["song_id"] == song_id:  # found
            pl["songs"].pop(i)  # remove it (pop by index)
            return True  # removed
    return False  # song not in the playlist

def get_playlist(self, timestamp, playlist_id):  # look up a playlist
    self._process_expiring_songs(timestamp)  # purge expired songs first
    if playlist_id not in self.playlists:  # unknown playlist
        return None
    return self.playlists[playlist_id]  # the whole dict (name + songs)
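self.playlists = {
    "pl1": {
        "name": "My Mix",
        "songs": [
            {"song_id": "s1", "title": "Song A", "duration": 180,
             "play_count": 0, "added_order": 1},
        ]
    }
}
# songs is a list, not a dict,
# because order matters (insertion order = added_order).
# Finding a song takes a for loop (O(n)), but playlists are usually small.
_process_expiring_songs(timestamp)
  Called on the first line of every L1 method.
  L1 itself never creates a TTL (add_song does not set expires_at),
  but building the habit means TTL works the moment L3 adds it.
list.remove(song) needs the whole dict for its equality check;
pop(i) removes directly by index, which is simpler and faster.
Return True as soon as the song is found; there is no need to walk the whole list.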
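One consequence of computing added_order as len(songs) + 1 is worth seeing once: after a removal, the next song can get a number that is already in use. The sketch below is a stripped-down, standalone take on add_song and remove_song; the free functions and the bare list are simplifications for illustration, not the class above.

```python
def add_song(songs, song_id, title, duration_sec):
    if any(s["song_id"] == song_id for s in songs):
        return False  # duplicate song_id
    songs.append({
        "song_id": song_id,
        "title": title,
        "duration": duration_sec,
        "play_count": 0,
        "added_order": len(songs) + 1,  # same rule as the notes: 1-based, length-derived
    })
    return True


def remove_song(songs, song_id):
    for i, song in enumerate(songs):
        if song["song_id"] == song_id:
            songs.pop(i)  # pop by index
            return True
    return False


songs = []
add_song(songs, "s1", "Song A", 180)
add_song(songs, "s2", "Song B", 240)
add_song(songs, "s3", "Song C", 120)
remove_song(songs, "s1")
add_song(songs, "s4", "Song D", 200)  # hypothetical fourth song

orders = [(s["song_id"], s["added_order"]) for s in songs]
by_order = [s["song_id"] for s in sorted(songs, key=lambda s: s["added_order"])]
print(orders)    # [('s2', 2), ('s3', 3), ('s4', 3)]
print(by_order)  # ['s2', 's3', 's4']
```

Because sorted() is stable, songs with the same added_order keep their list order, so sorting by "order" still gives insertion order. Whether the duplicate numbers matter depends on the spec; if they do, a per-playlist counter that only ever goes up is one possible alternative.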
list_songs = list the songs; sort_by = "order" (insertion order) or "duration" (length desc, ties by title asc)   total_duration = total length   search_songs = prefix search
def list_songs(self, timestamp, playlist_id, sort_by):  # list the songs in a playlist
    self._process_expiring_songs(timestamp)  # purge expired songs first
    if playlist_id not in self.playlists:  # unknown playlist
        return ""  # empty string
    songs = self.playlists[playlist_id]["songs"]  # fetch the songs list
    if not songs:  # no songs
        return ""  # empty string
    if sort_by == "order":  # by insertion order
        sorted_songs = sorted(songs, key=lambda s: s["added_order"])  # ascending
    elif sort_by == "duration":  # by length (desc; ties by title asc)
        sorted_songs = sorted(songs, key=lambda s: (-s["duration"], s["title"]))  # duration desc, title asc
    else:  # unsupported sort_by
        sorted_songs = songs  # return in stored order
    parts = []  # formatted pieces
    for s in sorted_songs:  # format each song
        parts.append(f"{s['song_id']}({s['title']})")  # "s1(Song A)" format
    return ", ".join(parts)  # comma-separated string

def total_duration(self, timestamp, playlist_id):  # total length of a playlist (seconds)
    self._process_expiring_songs(timestamp)  # purge expired songs first
    if playlist_id not in self.playlists:  # unknown playlist
        return 0
    songs = self.playlists[playlist_id]["songs"]  # fetch the songs list
    total = 0  # running sum of durations
    for s in songs:  # add each song
        total += s["duration"]
    return total  # total seconds

def search_songs(self, timestamp, playlist_id, prefix):  # search titles by prefix
    self._process_expiring_songs(timestamp)  # purge expired songs first
    if playlist_id not in self.playlists:  # unknown playlist
        return []  # empty list
    songs = self.playlists[playlist_id]["songs"]  # fetch the songs list
    result = []  # matching songs
    for s in songs:  # check each song
        if s["title"].startswith(prefix):  # title starts with prefix
            result.append(s)
    return result  # list of song dicts
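sort_by == "order":
  ascending added_order (1, 2, 3...)
  i.e. whichever song was added first comes first
sort_by == "duration":
  primary sort key = -duration (longer songs first)
  secondary sort key = title (same length → title A→Z)
  This is the classic way to write "desc + tie-break asc"
Format: song_id(title)
  e.g. "s1(Song A), s2(Song B)"
search_songs:
  prefix = "Song" → matches "Song A", "Song B", "Song C"
  prefix = "So" → same matches
  prefix = "B" → no match (the title is "Song B", not "B...")
  prefix = "" → matches everything (startswith("") is always True)
  Returns a list of song dicts, not a formatted string
total_duration:
  You could write: return sum(s["duration"] for s in songs)
  but the expanded loop is easier to read and more explicit
  Using sum() will not cost you marks in an interview; either is fine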
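A quick standalone check of the "desc + tie-break asc" key and the prefix match. The fourth song ("Ballad") is a made-up entry added only to force a duration tie; format_songs is a hypothetical helper that repeats the song_id(title) formatting.

```python
songs = [
    {"song_id": "s1", "title": "Song A", "duration": 180},
    {"song_id": "s2", "title": "Song B", "duration": 240},
    {"song_id": "s3", "title": "Song C", "duration": 120},
    {"song_id": "s4", "title": "Ballad", "duration": 180},  # hypothetical: ties with s1
]


def format_songs(items):
    # Same "song_id(title)" format as list_songs.
    return ", ".join(f"{s['song_id']}({s['title']})" for s in items)


# Negating duration gives descending order; title breaks ties ascending.
by_duration = sorted(songs, key=lambda s: (-s["duration"], s["title"]))
listing = format_songs(by_duration)

song_prefix = [s["song_id"] for s in songs if s["title"].startswith("Song")]
b_prefix = [s["song_id"] for s in songs if s["title"].startswith("B")]
empty_prefix = [s["song_id"] for s in songs if s["title"].startswith("")]

print(listing)       # s2(Song B), s4(Ballad), s1(Song A), s3(Song C)
print(song_prefix)   # ['s1', 's2', 's3']
print(b_prefix)      # ['s4']  ("Song B" does not start with "B")
print(empty_prefix)  # ['s1', 's2', 's3', 's4']
```

The negation trick only works for numeric keys; for a descending string key you would need a different approach, such as two stable sorts.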
play_song = play a song (record the event + play_count++)   get_most_played = the N most-played songs   _process_expiring_songs = remove songs whose TTL has expired
def play_song(self, timestamp, playlist_id, song_id):  # play a song (record an event + bump play_count)
    self._process_expiring_songs(timestamp)  # purge expired songs first
    if playlist_id not in self.playlists:  # unknown playlist
        return False
    pl = self.playlists[playlist_id]  # fetch the playlist
    for song in pl["songs"]:  # find the song
        if song["song_id"] == song_id:  # found
            song["play_count"] += 1  # play count +1
            self.play_history[playlist_id].append({  # record the play event
                "timestamp": timestamp,  # when it was played
                "song_id": song_id,  # which song
            })
            return True  # played
    return False  # song not in the playlist

def get_most_played(self, timestamp, playlist_id, n):  # the N most-played songs
    self._process_expiring_songs(timestamp)  # purge expired songs first
    if playlist_id not in self.playlists:  # unknown playlist
        return []  # empty list
    songs = self.playlists[playlist_id]["songs"]  # fetch the songs list
    sorted_songs = sorted(songs, key=lambda s: -s["play_count"])  # most played first
    return sorted_songs[:n]  # first N only
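play_song does two things:
1. song["play_count"] += 1 → updates the song's own counter
2. play_history[playlist_id].append({...}) → records the play event
They are not the same thing:
  play_count is "how many times this song has been played in total"
  play_history is "the full play log of this playlist" (with time and order)
get_most_played uses play_count
get_play_history (L4) uses play_history
sorted(songs, key=lambda s: -s["play_count"])
  -play_count = most played first (descending)
  [:n] = keep only the first N
Example: play_count = [3, 0, 1], n=2
  sorted → [3, 1, 0]
  [:2] → [3, 1]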
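The same example as runnable code, using the play counts from the notes (3, 0, 1). most_played is a hypothetical free-function version of get_most_played without the playlist lookup.

```python
songs = [
    {"song_id": "s1", "play_count": 3},
    {"song_id": "s2", "play_count": 0},
    {"song_id": "s3", "play_count": 1},
]


def most_played(items, n):
    # Descending by play_count; sorted() is stable, so ties keep list order.
    return sorted(items, key=lambda s: -s["play_count"])[:n]


top2 = [(s["song_id"], s["play_count"]) for s in most_played(songs, 2)]
top10 = [s["song_id"] for s in most_played(songs, 10)]

print(top2)   # [('s1', 3), ('s3', 1)]
print(top10)  # ['s1', 's3', 's2']  (slicing past the end is safe)
```

If the spec names an explicit tie-break (title, song_id), add it as a second element of the key tuple instead of relying on list order.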
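Songs added through add_song have no expires_at by default.
If the spec wants songs to carry a TTL, you can set:
  song["expires_at"] = timestamp + ttl_sec
_process_expiring_songs will then purge the expired ones automatically.
This is lazy mode: there is no timer; cleanup happens only when a public method is called.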
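A small sketch of the TTL boundary, assuming a hypothetical add_song_with_ttl helper (the notes only give the expires_at assignment, not a method name). A song is alive only while timestamp < expires_at; at the expiry second itself it is already gone.

```python
def add_song_with_ttl(songs, timestamp, song_id, ttl_sec):
    # Hypothetical helper: the notes only show the expires_at assignment.
    songs.append({"song_id": song_id, "expires_at": timestamp + ttl_sec})


def purge(songs, timestamp):
    # Keep songs with no TTL, or whose expiry is strictly in the future.
    return [
        s for s in songs
        if s.get("expires_at") is None or timestamp < s["expires_at"]
    ]


songs = [{"song_id": "s1"}]               # no expires_at: never expires
add_song_with_ttl(songs, 1000, "s2", 50)  # expires_at = 1050

alive_1049 = [s["song_id"] for s in purge(songs, 1049)]
alive_1050 = [s["song_id"] for s in purge(songs, 1050)]

print(alive_1049)  # ['s1', 's2']
print(alive_1050)  # ['s1']  (expired exactly at 1050)
```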
shuffle = random order (reassigns added_order)   queue_next = next unplayed song   get_play_history = play history   merge = merge two playlists
def shuffle_playlist(self, timestamp, playlist_id):  # randomly reorder a playlist
    self._process_expiring_songs(timestamp)  # purge expired songs first
    if playlist_id not in self.playlists:  # unknown playlist
        return False
    pl = self.playlists[playlist_id]  # fetch the playlist
    random.shuffle(pl["songs"])  # shuffle the songs list in place
    for i, song in enumerate(pl["songs"]):  # reassign added_order
        song["added_order"] = i + 1  # 1-based (new order)
    return True  # shuffled

def queue_next(self, timestamp, playlist_id):  # peek at the next unplayed song
    self._process_expiring_songs(timestamp)  # purge expired songs first
    if playlist_id not in self.playlists:  # unknown playlist
        return None
    songs = self.playlists[playlist_id]["songs"]  # fetch the songs list
    for song in songs:  # scan in list order
        if song["play_count"] == 0:  # never played
            return song  # return it (does not play it, only reports what is next)
    return None  # everything has been played

def get_play_history(self, timestamp, playlist_id):  # fetch the play history
    self._process_expiring_songs(timestamp)  # purge expired songs first
    return self.play_history[playlist_id]  # list of {timestamp, song_id} (may be empty)

def merge_playlists(self, timestamp, pl_id_1, pl_id_2):  # merge two playlists (pl_id_2's songs go into pl_id_1)
    self._process_expiring_songs(timestamp)  # purge expired songs first
    if pl_id_1 not in self.playlists:  # playlist 1 missing
        return False
    if pl_id_2 not in self.playlists:  # playlist 2 missing
        return False
    pl1 = self.playlists[pl_id_1]  # fetch playlist 1
    pl2 = self.playlists[pl_id_2]  # fetch playlist 2
    existing_ids = {s["song_id"] for s in pl1["songs"]}  # song_ids already in pl1
    for song in pl2["songs"]:  # each song in pl2
        if song["song_id"] not in existing_ids:  # add only if not a duplicate
            new_order = len(pl1["songs"]) + 1  # assign a new order
            new_song = dict(song)  # copy (pl2's original is left alone)
            new_song["added_order"] = new_order  # update the order
            pl1["songs"].append(new_song)  # add to pl1
            existing_ids.add(song["song_id"])  # update the set (guards against later repeats)
    return True  # merged
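shuffle_playlist:
  random.shuffle() shuffles the list in place.
  After shuffling, added_order must be reassigned,
  otherwise list_songs(sort_by="order") would disagree with the list order.
  Example: originally [s1(1), s2(2), s3(3)]
    after shuffle, possibly [s3, s1, s2]
    reassigned → [s3(1), s1(2), s2(3)]
queue_next:
  Walks the songs list in order (i.e. added_order order).
  Finds the first song with play_count == 0.
  Returns the whole song dict (nothing is modified).
  Use: a UI can show "up next".
  Actually playing the song takes a separate play_song() call.
merge_playlists:
  A set remembers the song_ids pl1 already has.
  Each pl2 song is checked and added only if it is not a duplicate.
  dict(song) makes a shallow copy, so pl2's original song is not modified.
  existing_ids.add() after each append prevents duplicates within the same merge.
  Example: pl1 has [s1, s2], pl2 has [s2, s3]
    s2 is a duplicate → skipped
    s3 is new → added to pl1 with order = 3
get_play_history:
  play_history is a defaultdict(list), so a playlist that was never played still returns []
  (the lookup also creates an empty entry for that playlist_id as a side effect).
  The return value is a reference (not a copy).
  If the spec requires immutability, return list(self.play_history[...]).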
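The merge rules above, replayed on the pl1 = [s1, s2], pl2 = [s2, s3] example. merge_into is a hypothetical standalone version of the loop in merge_playlists, working on bare song lists.

```python
def merge_into(dest_songs, src_songs):
    existing_ids = {s["song_id"] for s in dest_songs}
    for song in src_songs:
        if song["song_id"] not in existing_ids:
            new_song = dict(song)  # shallow copy: the source dict stays untouched
            new_song["added_order"] = len(dest_songs) + 1
            dest_songs.append(new_song)
            existing_ids.add(song["song_id"])


pl1 = [
    {"song_id": "s1", "added_order": 1},
    {"song_id": "s2", "added_order": 2},
]
pl2 = [
    {"song_id": "s2", "added_order": 1},
    {"song_id": "s3", "added_order": 2},
]

merge_into(pl1, pl2)

merged = [(s["song_id"], s["added_order"]) for s in pl1]
source_after = [(s["song_id"], s["added_order"]) for s in pl2]

print(merged)        # [('s1', 1), ('s2', 2), ('s3', 3)]
print(source_after)  # [('s2', 1), ('s3', 2)]  (pl2 is unchanged)
```

A shallow copy is enough here because every value in a song dict is a scalar; if songs ever held nested lists or dicts, copy.deepcopy would be the safer choice.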
batch = run several operations in one call   lock = a mutex (one holder at a time)   per-playlist lock = one lock per playlist_id
async def batch_operations(self, timestamp, ops):  # run a batch of create/add/remove/play ops
    results = []  # result of each op
    for op in ops:  # process in input order
        op_type = op["type"]  # op type
        pid = op["playlist_id"]  # playlist_id
        if op_type == "create":  # create
            name = op["name"]  # playlist name
            async with self.locks[pid]:  # lock this playlist_id
                ok = self.create_playlist(timestamp, pid, name)  # reuse the L1 create
            results.append(ok)  # record the result
        elif op_type == "add_song":  # add_song
            async with self.locks[pid]:  # lock this playlist_id
                ok = self.add_song(timestamp, pid, op["song_id"], op["title"], op["duration"])  # reuse L1
            results.append(ok)  # record the result
        elif op_type == "remove_song":  # remove_song
            async with self.locks[pid]:  # lock this playlist_id
                ok = self.remove_song(timestamp, pid, op["song_id"])  # reuse L1
            results.append(ok)  # record the result
        elif op_type == "play":  # play
            async with self.locks[pid]:  # lock this playlist_id
                ok = self.play_song(timestamp, pid, op["song_id"])  # reuse L3
            results.append(ok)  # record the result
        else:  # unsupported type
            results.append(False)  # always False
    return results  # same length as the input list
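def __init__(self):
    self.playlists = {}
    self.play_history = defaultdict(list)
    self.locks = defaultdict(asyncio.Lock)
self.locks = {
    "pl1": <asyncio.Lock>,  # defaultdict creates it on first access
    "pl2": <asyncio.Lock>,
}
# per-playlist lock (not per-song)
# Two ops on different playlists → do not block each other
# Two ops on the same playlist → the later one waits
# Coarser-grained than Gym's per-member lock
# Note: batch_operations as written runs the ops one at a time, in input order,
# and the sync methods inside the lock never await, so the lock is never contended here.
# It starts to matter once a locked section awaits (e.g. a simulated API call) and ops run concurrently.
New:
  batch_operations(timestamp, ops) ← async
Added to __init__:
  self.locks = defaultdict(asyncio.Lock)
The L1/L2/L3/L4 sync methods are unchanged (batch calls them).
{"type": "create", "playlist_id": "pl1", "name": "Chill"}
{"type": "add_song", "playlist_id": "pl1", "song_id": "s1", "title": "X", "duration": 180}
{"type": "remove_song", "playlist_id": "pl1", "song_id": "s1"}
{"type": "play", "playlist_id": "pl1", "song_id": "s1"}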
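To watch a per-playlist lock actually serialize work, the ops have to run concurrently and the locked section has to await something. The sketch below is not the batch_operations above: it is a minimal illustration that assumes each op does a simulated 10 ms of awaited work and that the ops are launched together with asyncio.gather. The op coroutine and the log list are made up for the demo.

```python
import asyncio
from collections import defaultdict


async def main():
    locks = defaultdict(asyncio.Lock)  # one lock per playlist_id, created on first access
    log = []

    async def op(pid, name):
        async with locks[pid]:
            log.append(("start", pid, name))
            await asyncio.sleep(0.01)  # simulated work that yields to the event loop
            log.append(("end", pid, name))
        return name

    # Two ops on pl1 and one on pl2, all launched at once.
    results = await asyncio.gather(op("pl1", "a"), op("pl1", "b"), op("pl2", "c"))
    return results, log


results, log = asyncio.run(main())
print(results)  # ['a', 'b', 'c']: gather keeps input order
for entry in log:
    print(entry)
```

In the log, "b" cannot start until "a" has ended (same playlist), while "c" starts before "a" ends (different playlist).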
sync = copy songs across playlists   semaphore = limits how many transfers run at the same time   fail-fast = fail immediately if a playlist is missing or the source is empty, without waiting for the semaphore
async def sync_playlists(self, timestamp, transfers, max_concurrent):  # run transfers concurrently, at most N at a time
    self._process_expiring_songs(timestamp)  # purge expired songs first
    sem = asyncio.Semaphore(max_concurrent)  # semaphore with N slots (at most N at once)
    tasks = []  # coroutines to run
    for transfer in transfers:  # wrap each transfer in a coroutine
        task = self._do_one_sync(timestamp, transfer, sem)  # create the coroutine (not awaited yet)
        tasks.append(task)
    results = await asyncio.gather(*tasks)  # run concurrently, wait for all, keep input order
    return list(results)  # return a plain list

async def _do_one_sync(self, timestamp, transfer, sem):  # one sync transfer (async helper)
    src_id = transfer["source_playlist_id"]  # source playlist_id
    dest_id = transfer["dest_playlist_id"]  # destination playlist_id
    # fail-fast: checked before taking the semaphore (does not hold up other tasks)
    if src_id not in self.playlists:  # source playlist missing
        return False  # fail at once, semaphore never acquired
    if dest_id not in self.playlists:  # dest playlist missing
        return False  # fail at once
    src_pl = self.playlists[src_id]  # fetch the source playlist
    if not src_pl["songs"]:  # source has no songs
        return False  # an empty playlist is not synced
    async with sem:  # take the semaphore only after passing fail-fast (rate limit)
        await asyncio.sleep(0.01)  # simulated sync latency (10 ms)
        dest_pl = self.playlists[dest_id]  # fetch the dest playlist
        existing_ids = {s["song_id"] for s in dest_pl["songs"]}  # song_ids already in dest
        for song in src_pl["songs"]:  # each source song
            if song["song_id"] not in existing_ids:  # add only if not a duplicate
                new_order = len(dest_pl["songs"]) + 1  # assign a new order
                new_song = dict(song)  # copy
                new_song["added_order"] = new_order  # update the order
                new_song["play_count"] = 0  # reset play_count (the new playlist counts afresh)
                dest_pl["songs"].append(new_song)  # add to dest
                existing_ids.add(song["song_id"])  # update the set
        return True  # synced
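def __init__(self):
    self.playlists = {}
    self.play_history = defaultdict(list)
    self.locks = defaultdict(asyncio.Lock)
# Same as L5; the semaphore is created inside the method (per call)
Fail-fast checks:
  1. source playlist does not exist → False
  2. dest playlist does not exist → False
  3. source playlist has no songs (empty) → False
  All three checks happen before acquiring sem.
  A failing transfer exits at once and never occupies a semaphore slot.
  This is the fail-fast pattern.
merge_playlists (L4) = sync method, done in one go
sync_playlists (L6) = async, throttled by a semaphore, with a sleep
The core logic is the same (copy the non-duplicate songs to dest),
but the sync version:
  1. sleeps (simulating network latency)
  2. resets play_count to 0 (the new playlist counts afresh)
  3. has fail-fast (an empty source is skipped)
{"source_playlist_id": "pl1", "dest_playlist_id": "pl2"}
# sync means "copy songs from source to dest"
# source is not modified (no songs are removed)
# dest gains songs (only the non-duplicates)
Same pattern as Gym, Bank and FS:
  check outside sem → valid: enter sem + sleep → invalid: return at once
asyncio.gather(*tasks) preserves input order.
list(results): gather already returns a list, so this is only a defensive copy, not a tuple-to-list conversion.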
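The fail-fast + semaphore + gather skeleton, cut down to the bare pattern. sync_all, known_ids and the peak counter are illustrative names, not part of the class above; the counter is there only to show that no more than max_concurrent requests are ever inside the semaphore.

```python
import asyncio


async def sync_all(known_ids, requests, max_concurrent):
    sem = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def one(pid):
        nonlocal in_flight, peak
        if pid not in known_ids:  # fail-fast: checked before touching the semaphore
            return False
        async with sem:  # at most max_concurrent coroutines get past this line
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # simulated API latency
            in_flight -= 1
        return True

    results = await asyncio.gather(*(one(pid) for pid in requests))
    return results, peak


results, peak = asyncio.run(
    sync_all({"pl1", "pl2", "pl3"}, ["pl1", "pl99", "pl2", "pl3"], max_concurrent=2)
)
print(results)        # [True, False, True, True]: same order as the requests
print(type(results))  # <class 'list'>: gather already returns a list
print(peak)           # 2
```

For an all-sleep spec ("every request must attempt, even if missing"), the validity check would move inside the `async with sem` block, after the sleep.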