---
title: LIS Analyzer Middleware
slug: middleware
status: active
owner: hazem
updated: 2026-06-18   # +query/order-download deep-dive + the two-instance incident + live diagnostic recipe
refs:
  - moon-erp-be/docs/lis/analyzer-middleware-guide.md   # full guide (ships with the code)
  - moon-erp-be/lis-middleware/                          # the source of truth (git)
  - moon-erp-be/lis-middleware/CHANGELOG.md              # middleware change log
plans:
  - https://moonui.elbaset.com/knowledge-base/plans/lis-vitros-onboarding-plan.html
  - https://moonui.elbaset.com/knowledge-base/plans/lis-middleware-audit.html
related:
  - moonstack-update
---

## Context
On-prem Python middleware bridges lab analyzers ↔ Moon ERP LIS cloud. 3 layers:
`analyzer (ASTM E1381/E1394 or HL7 MLLP, TCP) → middleware (Windows PC, listener) → cloud (HTTPS /lis/machine-results, X-Authorization)`.
Traffic is bidirectional: the analyzer sends a **Query** (barcode → "what tests?"), the LIS replies with the **order**, then the analyzer sends **Results**.

## 🏛️ Source of truth = the CLOUD (code AND config) — client = a thin synced copy [owner principle, 2026-06-19]
**Standing rule (owner): the middleware must be cloud-canonical so the cloud is always complete; the client only runs a COPY.** This is already how it's built — confirmed in code:
- **Code:** `moon-erp-be/lis-middleware/` is in **git** (29 tracked files) on the server = the source. The Windows box `C:\Users\HP\lis-middleware` is a **deployment copy** (scp). NEVER hand-edit on Windows — edit the repo, commit, deploy (see edit→deploy loop).
- **Device config (machines, ports, drivers, analyzer test-codes):** the source of truth is the **cloud DB** (`lab_machines.connection_settings`, `lab_machine_test_mappings`, `lab_device_model_tests`), NOT the client. On startup `app.py` calls **`sync_from_cloud()`** → `cloud.list_machines()` = **`GET /lis/machines`** ("the cloud-managed analyzer list with connection_settings the middleware uses to build its listeners"). The local **`config.json` is just a bootstrap** (cloud URL + token) and **can be empty** — devices come from the cloud; the admin UI's Save calls `reload()` and re-syncs.
- **Consequence (used for the VITROS fix):** fixing analyzer codes/mappings/ports = **a CLOUD-side edit only**; the client picks them up on its next sync/restart. No need to touch the client machine. Keep it that way — never let a client diverge from the cloud.
- **🎯 GAP → TARGET (owner requirement, 2026-06-19): the CODE must come from the cloud too.** Today only the **config** syncs from the cloud; the **code is still manual `scp`** (DEPLOY.md), `__version__="1.0.0"` hardcoded, and the CHANGELOG explicitly states the middleware is "مكوّن منفصل — مش بيتوزّع عبر MoonStack" (a separate component, NOT distributed via MoonStack). **The owner wants the client to PULL/auto-update its source from the cloud** (genuine cloud-origin, zero manual copy, no drift). Net-new work: a middleware distribution/update channel — e.g. cloud serves a versioned middleware package + the client version-checks on startup and self-updates (mirror MoonStack's update model, or a minimal `GET /lis/middleware/latest` + download + swap + restart). Until built, deploy stays manual scp. **Standing rule (owner): log every discovery here in the KB as info to return to later** — [[feedback_kb_update_workflow]].
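
A minimal sketch of the version-check half of that update channel, assuming dotted numeric versions like the current `__version__="1.0.0"`. The `GET /lis/middleware/latest` endpoint is only a proposal in this note and does not exist yet, so the sketch takes the advertised version as a plain string and leaves download/swap/restart out. `parse_version` and `needs_update` are hypothetical names.

```python
# Hypothetical helpers: nothing here exists in lis-middleware yet.
def parse_version(text: str) -> tuple:
    """Turn '1.2.10' into (1, 2, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_update(local: str, latest: str) -> bool:
    """True when the cloud advertises a strictly newer version than the client runs."""
    return parse_version(latest) > parse_version(local)
```

Comparing integer tuples rather than raw strings matters once a component reaches two digits: as plain text, `"1.0.10"` sorts before `"1.0.9"`.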

## How it works / current state
- **Source of truth = `moon-erp-be/lis-middleware/`** (git, branch `lis/lab-updates`). The Windows box `C:\Users\HP\lis-middleware` is a **deployment copy**. NEVER hand-edit on Windows.
- **Drivers** (`lis_middleware/drivers/`): a key string → a driver object. `astm_base.py` is the generic ASTM parser; per-device drivers subclass it. Registered in `drivers/__init__.py` `DRIVERS` + `config.KNOWN_DRIVERS`. Transport (TCP vs serial) chosen by device config; ASTM drivers carry `protocol="astm"`.
- Live devices (cloud = moonui dev BE, company 4): Maglumi 800 (m6, `maglumi_astm`, 15001), Maglumi X3 elmadina (m7, 15002), Udichem (m9, `udichem_astm`, 15003), **VITROS 4600 (m10, `vitros_astm`, 15004)**, Dymind DH36 (m3, `dymind_hl7`, 5600).
- Run = user double-clicks `run.bat` (console; detached python dies). Admin UI `http://127.0.0.1:8765`. Log `lis-middleware.log`.

## 🧠 How it ACTUALLY works — transports + the bidirectional query (deep dive)
**Three transport handlers, one logic core.** All reuse `astm.Receiver` / the device driver / the `outbox` / the `order_provider`; only the wire differs. `app.py` wires the devices + sets the `order_provider` = a `(machine_id, barcode) -> list[analyzer_code]` callable.

| File | Transport | Devices | Replies to host-query? |
|---|---|---|---|
| `astm_tcp_server.py` (`AstmTcpDevice`) | ASTM E1381/E1394 over **TCP** (ENQ/ACK/STX..ETX/EOT) | Maglumi 800/X3, **Udichem**, VITROS | ✅ full bidirectional (lines ~116-134) |
| `serial_server.py` | ASTM over **serial** | serial analyzers | ✅ yes (lines ~156-165) |
| `server.py` | **HL7 MLLP** over TCP | Dymind DH36 | ❌ **logs only** `"(bidirectional handling: app)"` — the HL7 path does NOT reply to queries |

**The QUERY / order-download flow** (`astm_tcp_server.py`, the path the Udichem uses):
1. Analyzer scans a tube → sends an ASTM **Q record** (host query) carrying the tube barcode.
2. Handler `ptype=="query"` → `barcode = parsed["query_barcode"]` → logs `host query for <bc>`.
3. → `order_provider(machine_id, barcode)` → `cloud.orders_for(machine_id, barcode)` = **`GET /lis/machines/{id}/orders?barcode=<bc>`** → returns the **analyzer test codes** ordered for that tube.
4. codes → `driver.build_order(barcode, codes)`; empty → `driver.build_no_order(barcode)` → sent back over the same connection.
5. Analyzer runs the tests → sends **R records** → handler enqueues → `POST /lis/machine-results` (X-Authorization).

**Cloud endpoints** (`cloud.py` → `LabMachineController`): `GET /lis/machines/{id}/orders?barcode=` (tests for ONE tube = the query reply) · `GET /lis/machines/{id}/work-orders` (all pending) · `POST …/machine-results` (results in).
→ **"the query doesn't come back"** = EITHER the handler never replied (transport/instance issue), OR `orders_for` returned `[]` (no LIS request for that barcode, or no investigation→analyzer-code mapping) → `build_no_order`.
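
The query path above reduces to one decision: codes came back → order, no codes → no-order. A minimal sketch, assuming the `(machine_id, barcode) -> list[analyzer_code]` provider signature described earlier. `StubDriver`, `reply_to_query`, and the reply strings are hypothetical stand-ins, not the real handler or driver output.

```python
from typing import Callable, Dict, List

OrderProvider = Callable[[int, str], List[str]]


class StubDriver:
    """Stand-in driver: the real build_order/build_no_order emit ASTM records."""

    def build_order(self, barcode: str, codes: List[str]) -> str:
        return f"ORDER {barcode} {','.join(codes)}"

    def build_no_order(self, barcode: str) -> str:
        return f"NO-ORDER {barcode}"


def reply_to_query(machine_id: int, parsed: Dict[str, str],
                   driver: StubDriver, order_provider: OrderProvider) -> str:
    """Build the reply for one host query (a parsed Q record)."""
    barcode = parsed["query_barcode"]
    codes = order_provider(machine_id, barcode)   # in production: the cloud lookup
    if codes:
        return driver.build_order(barcode, codes)
    # Empty list: no LIS request for this barcode, or no code mapping.
    return driver.build_no_order(barcode)
```

Swapping `order_provider` for a fake that returns a fixed list makes it possible to exercise both branches without the cloud or an analyzer.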

## 🔴 INCIDENT (2026-06-18) — TWO middleware instances → duplicate results + broken query
**Symptom:** Udichem "stopped working well" though it HAD worked (owner had tested query + result OK earlier — so NOT a fundamental bug, a **regression**). Every result appeared **twice in the log with the SAME second-timestamp** and posted to the cloud twice; the analyzer's **query stopped coming back**.
**Root cause:** TWO middleware processes were running on the Windows PC — `python.exe` (PID 471368) + `pythonw.exe` (PID 494944), both started ~6/17 21:05, **both bound to the same ports** (15001/15003; 494944 also 15004). Possible ONLY because the TCP server sets **`allow_reuse_address = True`** (`_ThreadingTCPServer`) — otherwise the 2nd launch would die "port in use". With both bound, incoming data is handled by BOTH → results **duplicated**, and a query gets a **DOUBLED order reply** → the analyzer rejects it → "query didn't come back". (The earlier VITROS-style ACK-loop theory was WRONG for this — the device was fine on ONE instance.)
**🔑 WHY Maglumi/Dymind kept "working" but the Udichem stopped (owner's question, 2026-06-19):** the duplication hit **ALL** analyzers — confirmed on data: every cloud `lab_machine_results` row was doubled for machine 9 (Udichem: 1785+1786, 1783+1784…) AND machine 3 (Dymind). The difference is **direction**: **Maglumi 800 / Dymind = result-only (one-way)** — they just dump R records; a duplicate result is harmless redundancy (the value still shows + matches), so the owner *perceived* them as "working". **The Udichem is bidirectional/query-dependent** — it sends a host-Q for each tube and **waits for a clean order reply before it can run anything**; with two instances the query/reply handshake got tangled (doubled/conflicting reply) → the analyzer couldn't load its order → "didn't run the tests / query didn't come back". So duplication = invisible for one-way analyzers, **fatal** for the query-driven Udichem. After dropping to ONE instance, a simulated Udichem result posted to the cloud **exactly once** (see test recipe below) → confirms the fix end-to-end.
**🔑 WHERE THE 2ND INSTANCE COMES FROM (the auto-start):** there is a Windows **Scheduled Task `MoonLisMiddleware`** that runs `…\Python312\python.exe run.py --config config.json` (WorkDir `C:\Users\HP\lis-middleware`) on a **Logon trigger** → it launches a HIDDEN/windowless instance at every login (shows as `pythonw.exe`). When the owner ALSO double-clicks **`run.bat`** (a console `python.exe run.py`), that's a SECOND instance → the duplicate. **The hidden task instance has no window, so closing the run.bat console does NOT stop it** → restarts "don't fix it" until the hidden one is killed (Task Mgr → end `pythonw.exe`). **Fix options:** (a) the owner wants shortcut-only → DISABLE the task (`Disable-ScheduledTask -TaskName MoonLisMiddleware` / Task Scheduler → Disable) so only `run.bat` starts it (no auto-start after reboot); (b) keep the task for auto-start resilience + DON'T run run.bat; (c) best long-term: a single-instance guard (pidfile/lock) in `run.py` so a 2nd launch can't duplicate regardless. ⚠️ disabling the task / killing the proc on the LIVE client machine needs the owner's explicit go (harness classifier blocks unauthorized writes to it).
**Fix:** run exactly ONE instance. Clean restart on the PC: Task Manager → Details → End BOTH `python.exe` + `pythonw.exe` → double-click `run.bat` once. (Or kill only the duplicate PID.) ⚠️ **Killing a process on the client's LIVE machine is blocked by the harness auto-classifier → needs the owner's explicit "go".**
**Cleanup:** `lab_machine_results` has many duplicate rows from the two-instance window — dedupe after.
**Prevent (TODO):** a single-instance guard (pidfile/lock) in `run.py`, or drop `allow_reuse_address` so a 2nd launch fails loudly; + a proper Windows service/watchdog (still none → today only `run.bat` and the `MoonLisMiddleware` logon task start it).
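
One possible shape for that guard, sketched with a loopback guard socket instead of a pidfile (a swapped-in technique: the OS releases the socket when the process dies, so there is no stale lock to clean up). `GUARD_PORT` and `acquire_single_instance_guard` are hypothetical; nothing like this exists in `run.py` yet.

```python
import socket

GUARD_PORT = 48765  # hypothetical: any fixed port not used by a listener


def acquire_single_instance_guard(port: int = GUARD_PORT):
    """Return a bound socket if we are the only instance, else None.

    The caller must keep the returned socket open for the life of the process.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Deliberately NO SO_REUSEADDR: a second bind on the same port must fail.
    if hasattr(socket, "SO_EXCLUSIVEADDRUSE"):  # Windows-only option
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_EXCLUSIVEADDRUSE, 1)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
    except OSError:
        sock.close()
        return None  # another instance already holds the guard
    return sock
```

In `run.py` this would run before any listener starts: exit with a clear message when `None` comes back, so a second launch from `run.bat` or the Scheduled Task cannot bind the analyzer ports.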

## 🔧 Live diagnostic recipe (SSH to the Windows PC)
Tunnel: `ssh -4 -p 2222 -o BatchMode=yes hp@127.0.0.1` (force IPv4; key auth; **run Bash with `dangerouslyDisableSandbox`**). Port-2222 tunnel drops constantly → owner restarts it from Windows; check on the server: `ss -tlnp | grep :2222` (no listener = down).
- **Instances:** `powershell "Get-Process python,pythonw | Select Id,StartTime,Path"` → must be exactly **ONE**.
- **Ports:** `netstat -ano | findstr "15001 15002 15003 15004 5600"` → a port LISTENING under **TWO PIDs = the two-instance bug**.
- **Log** `C:\Users\HP\lis-middleware\lis-middleware.log`: duplicate lines (same timestamp) = two instances · `host query for X` then `order: X N tests` = query OK · `no order` = LIS returned no codes (barcode unmatched / mapping missing) · `queued N result(s) for <bc>` = results in.
- **Cloud side:** Udichem = **machine id 9, device_model_id 5**, `udichem_astm`, TCP 15003, `connection_status`/`last_communication_at` on `lab_machines`; results in `lab_machine_results`; analyzer-code↔investigation in `lab_machine_test_mappings`.
- Windows python `C:\Users\HP\AppData\Local\Programs\Python\Python312\python.exe`; admin UI `http://127.0.0.1:8765`.
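
The port check can be automated. A minimal sketch that scans `netstat -ano` text for watched ports held by more than one PID, assuming the usual Windows column layout (Proto, Local Address, Foreign Address, State, PID). `ports_with_multiple_listeners` is a hypothetical helper, and the sample rows in the usage note are illustrative.

```python
from collections import defaultdict

WATCHED_PORTS = {5600, 15001, 15002, 15003, 15004}  # port map from this note


def ports_with_multiple_listeners(netstat_text, ports=WATCHED_PORTS):
    """Map each watched port to its PIDs when more than one PID is LISTENING on it."""
    owners = defaultdict(set)
    for line in netstat_text.splitlines():
        cols = line.split()
        # Keep only TCP rows in LISTENING state with all five columns present.
        if len(cols) != 5 or cols[0] != "TCP" or cols[3] != "LISTENING":
            continue
        port_text = cols[1].rsplit(":", 1)[-1]   # works for 0.0.0.0:N and [::]:N
        if port_text.isdigit() and cols[4].isdigit() and int(port_text) in ports:
            owners[int(port_text)].add(int(cols[4]))
    return {port: pids for port, pids in owners.items() if len(pids) > 1}
```

Fed two `LISTENING` rows for `0.0.0.0:15001` with different PIDs, it returns that port with both PIDs; an empty dict means every watched port has a single owner.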

## 🧪 Inject a test result (end-to-end simulator) — VERIFIED 2026-06-19
Simulate an analyzer sending a result, to confirm the full **analyzer → middleware → cloud** path WITHOUT a physical device. **Run the sim ON the Windows PC** (the listeners bind `0.0.0.0` but are reached via `127.0.0.1:<port>`; not reachable from our server's network), e.g. via SSH + `python.exe -c`.
**⚠️ FRAMING — the one thing that matters:** the TCP handler (`astm_tcp_server.py` `_make_handler`) uses **TOLERANT framing**, NOT strict ASTM. It accumulates record chars between STX and ETX, treats **CR as the record separator**, ACKs every STX/ETX, dispatches on EOT — and **does NOT strip frame-numbers or checksums**. So the real analyzers (and our own `_send_message`) send each record as **`STX + record + CR + ETX`** with **NO frame-number, NO checksum, NO per-frame CRLF**. Sending strict ASTM (FN digit + checksum per frame) makes the FN+checksum leak into the record text → `1H|…6A`, `4R|…D5` → parser sees record-type `1`/`4` not `H`/`R` → `type=other, 0 results` (archived but never queued). Match `_send_message` exactly:
```python
# on the PC: python.exe -c (one record per STX..CR..ETX frame, no FN/checksum)
import socket
import time

ENQ, STX, ETX, EOT, CR = 5, 2, 3, 4, 13
recs = [r'H|\^&||PSWD|Udichem|||||Lis||P|E1394-97|20260619', 'P|1',
        'O|1|SIMTEST-UDI-001||^^^GLU',
        'R|1|^^^GLU|99|mg/dl|70 to 110|N||||||20260619', 'L|1|N']

# ENQ, then every record as STX + record + CR + ETX, then EOT.
buf = bytes([ENQ])
for r in recs:
    buf += bytes([STX]) + r.encode('latin-1') + bytes([CR, ETX])
buf += bytes([EOT])

s = socket.create_connection(('127.0.0.1', 15003), timeout=10)
s.settimeout(3)
s.sendall(buf)
time.sleep(2)  # let the handler read + dispatch the EOT (don't RST-close early)
try:
    s.shutdown(socket.SHUT_WR)
except OSError:
    pass  # connection already gone; nothing left to flush
s.close()
```
**Verify:** log → `[Udichem 240 Plus] queued 1 result(s) for SIMTEST-UDI-001`; raw archive in cloud `lab_machine_communication_logs` (`parsed_summary='result | … | 1 result(s)'`, raw_data must be CLEAN records, no leading digits); result in `lab_machine_results` (machine_id=9, barcode, test_code, raw_result). A **test barcode** lands `status=pending, matched_at=NULL` (no LIS request) — that's expected; a real tube barcode matches. **Gotcha:** `s.sendall(EOT); s.close()` with no delay sends an RST before the handler reads the EOT → `WinError 10053`, message lost; always pause/`shutdown(SHUT_WR)` after EOT. Port map: Dymind=5600(HL7), maglumi800=15001, maglumi-x3=15002, **Udichem=15003**, VITROS=15004.
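
To make the framing difference concrete, here is a minimal sketch of both frame shapes. `frame_tolerant` mirrors the `STX + record + CR + ETX` shape the handler expects. `frame_strict` follows the standard ASTM E1381 layout: a frame number after STX, then a two-hex-digit checksum (the sum of every byte from the frame number through ETX, modulo 256) and CR LF. Both function names are hypothetical, not the middleware's own `_send_message`.

```python
STX, ETX, CR, LF = 0x02, 0x03, 0x0D, 0x0A


def frame_tolerant(record: str) -> bytes:
    """Simplified frame: no frame number, no checksum."""
    return bytes([STX]) + record.encode("latin-1") + bytes([CR, ETX])


def frame_strict(record: str, frame_number: int) -> bytes:
    """Standard E1381 frame: STX, FN, text, CR, ETX, 2-hex checksum, CR, LF."""
    body = (str(frame_number % 8).encode("ascii")
            + record.encode("latin-1") + bytes([CR, ETX]))
    checksum = sum(body) % 256            # covers FN through ETX, excludes STX
    return bytes([STX]) + body + f"{checksum:02X}".encode("ascii") + bytes([CR, LF])
```

A strict frame carries two things the tolerant handler does not strip: the leading frame-number digit and the trailing checksum characters. That is how a header ends up parsed as record type `1` instead of `H`.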

## The edit→deploy loop (DO THIS EVERY TIME)
1. Edit the **repo** source.
2. Run `python3 -m py_compile` (+ a `get_driver()` sanity check for new drivers).
3. **Log** a bullet in `lis-middleware/CHANGELOG.md` under `[Unreleased]`.
4. `git commit`, then chown to `moonui:moonui`.
5. **Deploy in a maintenance window** (a restart drops live analyzers): `scp` `lis_middleware/` + `run.py` to the Windows box, keep its `config.json`, then the user re-runs `run.bat` (Python caches modules → restart required).

## Adding a new analyzer (recipe)
Driver file (subclass `AstmDriver`) → register → device-model (`LabDeviceModel`, driver=key) + `LabMachine` instance (set `transport:'tcp'` explicitly for `*_astm` drivers — instantiate defaults them to serial) → machine-test-mappings (analyzer code → investigation). Full recipe: `moon-erp-be/docs/lis/analyzer-middleware-guide.md` §5.
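
The first two steps (driver file, then register) have roughly this shape. This is a sketch only: the real `AstmDriver` interface, `DRIVERS` contents, and `get_driver()` signature are not shown in this note, so the base class here is a stand-in and `AcmeDriver` / `acme_astm` are invented for illustration.

```python
class AstmDriver:
    """Stand-in for the generic ASTM base (the real one lives in astm_base.py)."""
    key = "astm_base"
    protocol = "astm"

    def parse(self, records):
        # Generic behaviour: split each record into its |-delimited fields.
        return [record.split("|") for record in records]


class AcmeDriver(AstmDriver):
    """Hypothetical per-device driver: only the device quirk is overridden."""
    key = "acme_astm"

    def parse(self, records):
        # Invented quirk: this imaginary device pads its records with spaces.
        return super().parse([record.strip() for record in records])


# Registration: key string -> driver class.
DRIVERS = {cls.key: cls for cls in (AstmDriver, AcmeDriver)}


def get_driver(key):
    """Instantiate the driver registered under `key`, failing loudly on typos."""
    try:
        return DRIVERS[key]()
    except KeyError:
        raise ValueError(f"unknown driver key: {key!r}") from None
```

The same key string is what the device-model's `driver` field would carry, which is why a lookup right after registering is a cheap sanity check.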

## Decisions & why
- **Repo-as-source-of-truth**, Windows is a copy → reproducible, versioned, no drift. (see guide)
- **Per-device thin drivers over a shared ASTM base** → one protocol, device quirks isolated.
- **VITROS = full ASTM, NOT simplified.** Unlike Maglumi (simplified: no frame#/checksum), VITROS (LIS2-A) prefixes every record with a frame number (0-7), puts a 2-hex checksum on its own line, interleaves `M` records, and wraps the assay code as `^^^1.0000+CODE+1.0` (e.g. 309). So `vitros_astm.parse()` de-frames + extracts the real code. Verified on a real capture (sample 11111122, test 309 = 120.1 mmol/L). Commit `c14a1d333`.
- **Don't map HIL indices** (950 HEM / 951 ICT / 952 TUR) — they're sample-quality flags, not billable tests.
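
The VITROS de-framing decision above can be sketched as two small steps: drop the frame number and checksum lines, then unwrap the assay code. This is an illustration under stated assumptions, not the actual `vitros_astm.parse()`. `deframe` and `unwrap_assay_code` are hypothetical names, and the sample line in the usage note is constructed, not the real capture.

```python
import re

_FRAME_NUMBER = re.compile(r"^[0-7](?=[A-Z]\|)")   # leading FN digit before a record type
_CHECKSUM_LINE = re.compile(r"^[0-9A-Fa-f]{2}$")   # a line holding only the 2-hex checksum


def deframe(lines):
    """Strip frame numbers and drop checksum-only lines, keeping clean records."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or _CHECKSUM_LINE.match(line):
            continue
        records.append(_FRAME_NUMBER.sub("", line))
    return records


def unwrap_assay_code(universal_test_id):
    """'^^^1.0000+309+1.0' -> '309'; unwrapped IDs like '^^^GLU' pass through."""
    inner = universal_test_id.lstrip("^")
    parts = inner.split("+")
    return parts[1] if len(parts) == 3 else inner
```

For example, `deframe(["4R|1|^^^1.0000+309+1.0|120.1|mmol/L", "D5"])` keeps a single `R|…` record, and `unwrap_assay_code` on its third field gives the mappable code `309`.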

## Gotchas
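- **Port ≥ 15001** on Windows for new listeners (dynamic range 1024-14999 → WinError 10013 below it; Dymind's existing 5600 listener is the exception, so confirm the reserved ranges with `netsh interface ipv4 show excludedportrange protocol=tcp` before picking a port). Firewall must allow the port.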
- **Instantiate defaults `*_astm` → serial** (`str_contains(driver,'astm')`); pass `transport:'tcp'` for TCP analyzers (VITROS/Udichem).
- **`lab_machines.code` is NOT NULL + unique** — instantiate auto-generates it now (fix `f8da16d02`); old code 500'd when blank.
- **SSH tunnel (port 2222) drops constantly**, leaving a zombie holding the port → new tunnels fail with `remote port forwarding failed for listen port 2222`. Fix: as root `ss -tlnp|grep :2222` → `kill -9 <pid>` → user restarts the tunnel from Windows. Reach the PC: `ssh -4 -p 2222 hp@127.0.0.1` (force IPv4; sandbox-disabled). Windows python: `C:\Users\HP\AppData\Local\Programs\Python\Python312\python.exe`.
- **Barcode matching unsolved**: analyzers send their INTERNAL sample number (e.g. VITROS `00039997`, Maglumi `00039952`), not the LIS barcode → results land unmatched. The tube must carry the LIS barcode, or we add a translation.

## Open / next
- **VITROS re-transmits results in a loop** (QC + live sample 00039997 re-sent; codes duplicate within a sample; Maglumi/Udichem unaffected). Root cause: the middleware ACKs VITROS's full-ASTM frames at the wrong moment (ACK on STX *and* ETX, before the checksum — built for Maglumi's simplified ASTM). Fix = proper per-frame ACK after the complete frame (validate checksum), **isolated behind a per-device `full_astm` flag** so Maglumi/Udichem are untouched. NOT done yet.
- **✅ VITROS order-download FIXED + verified live (2026-06-19).** Symptom: VITROS replied `2M|1|E|PY1|033|N|Unable to find program. ID:…, Tray, Cup` and never ran → 0 results. **Root cause was TWO things (both fixed):** (1) **order format**: the inherited Maglumi `build_order` emitted a record VITROS can't parse: bare `^^^MG`/`^^^code\^^^code`, malformed header (`…|||P|1`), no body-fluid, no Report-Type. VITROS (per official Guide J32799 §5.2.1) needs Header version `LIS2-A`+ts, Universal Test ID **wrapped** `^^^<dil>+<code>+<ver>` with assays repeat-joined by `\`, **body fluid REQUIRED at O-16** (5=Serum), **Report Type `O` at O-26**, and a no-order reply of `H + L|1|I` (not `O|…|C`). (2) **codes**: `GLU`/`MG` were TEXT; VITROS uses NUMERIC Appendix-B program codes (300 GLU, 330 Mg, 309 Na+…). **Fix shipped:** (a) `VitrosDriver.build_order`/`build_no_order` override in the repo (`hazemdev` b625aa803), gated to VitrosDriver only (Maglumi/Udichem untouched), scp'd to the PC; (b) cloud data fix `GLU→300`, `MG→330` in `lab_device_model_tests` (model 6) + `lab_machine_test_mappings` (machine 10). After a `run.bat` restart (loads driver + re-syncs codes), the **query_sim confirmed** the reply is now `O|1|260618001292||^^^1.0+330+1.0|R||||||N||||5||||||||||O` (wrapped=True, header LIS2-A), the exact format VITROS accepts. Result path already worked (result_sim posted 4 rows). **The framing (FN+checksum) was NOT the cause**: VITROS accepted the simplified frames (it reached the parse-layer "Unable to find program"); full E1381 reply framing was a secondary conformance item, since shipped as G8 (see the audit bullet below). Full diagnosis: [report](https://moonui.elbaset.com/vitros-report.html) + workflow `vitros-astm-deep-dive`. Result-parse + sims verified end-to-end. ⏳ optional: confirm on a real VITROS when the lab runs it; dedupe two-instance-window duplicates.
- **VITROS integration AUDIT (independent, code-executed vs manual J32799, 2026-06-19):** core query→run→return loop **functionally complete + verified live**; the 3 load-bearing pieces (order-download format, no-order `L|1|I`, result parse) all conform. 27 requirements checked, gaps prioritized: **(production, config not code)** machine-10 code→investigation **mappings** only cover Glucose+Magnesium → other VITROS results land `matched=False` until the lab maps its menu; **(G3, code+cloud)** O-16 body-fluid hardcoded `5`=Serum (`DEFAULT_BODY_FLUID`) → non-serum chemistries mislabeled, needs specimen-type in the order payload; **✅ G4 FIXED** — M/E analyzer errors now surfaced (`VitrosDriver.parse` → `errors`, handler logs, commit ed03a3287); **(nice-to-have)** G1 NAK-retransmit (`_wait_for` doesn't resend), G2 R-7 flags stored raw + R-9 status unread, G5 catalog only 19 of the 4600 menu, G6 cloud GET inside the ~2.4s host-query window, G7 inbound O-3 drops Tray^Cup (barcode mode unaffected), **✅ G8 DONE** — full E1381 FN+checksum reply framing DEPLOYED + verified LIVE (query_sim shows the reply now framed `STX <FN>record CR ETX <C1C2> CR` per record: FN 1-4, checksums 14/3F/8C/07; records clean after strip). PC↔repo **drift check: all 20 .py byte-identical**. NONE block first-light. Full table in [report](https://moonui.elbaset.com/vitros-report.html) §🔎.
- **✅ DOWNLOAD-from-cloud DONE / ⏳ auto-update still pending (owner requirement 2026-06-19).** **Download half SHIPPED:** a **"Download Middleware"** button on Lab→Machines → `GET /lis/machines/download-middleware` (`LabMachineController::downloadMiddleware`, BE `b8e94cb71`, FE `cae2574`) zips `base_path('lis-middleware')` + injects `config.json` with **this install's `cloud.base_url` pre-filled** (`config('app.url').'/api'`) + email/password placeholders + READ-ME; skips runtime cruft. The middleware SOURCE now **ships with the app** — added `lis-middleware` to `config/moonstack.php` `package.include` (+ excluded its config.json/buffer/log/__pycache__), so every install serves its own matching middleware (cloud = the source of the code). So the operator: download → fill email/password in config.json → run.bat. **Still pending:** the *auto-update* half (client self-updates its middleware code on a version-check, no manual re-download) — not built; the download is manual-pull for now.
- **GOTCHA — device catalog vs machine instances (elmadina confusion 2026-06-19):** `/app/lab/machines` lists **machine INSTANCES** (`lab_machines`, per-lab, created manually with port/connection). The **CATALOG** (`lab_device_models`, global, 6 models) is what the seeder/update propagates + what the "Add machine (from catalog)" picker uses. An update adds/updates **catalog models, NOT instances** — so after updating, new analyzers (e.g. VITROS) appear in the catalog/picker (verified: elmadina `GET /lis/device-models` returns all 6), but the lab must still **Add machine → pick VITROS 4600** to create a working instance. "I see only 4" = the 4 instances, which is correct.
- **VITROS re-transmit loop (separate from the order bug):** ACK-timing on full-ASTM frames (ACK on STX+ETX before checksum); fix behind a per-device `full_astm` flag (details in the first bullet of this section). NOT done.
- No Windows service/watchdog yet (startup = `run.bat`, plus the `MoonLisMiddleware` logon Scheduled Task while it stays enabled).
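
For the order-download bullet above, the field positions can be sketched as follows. This is not the shipped `VitrosDriver.build_order`: `wrap_assay` and `build_order_record` are hypothetical, the dilution/version `1.0` values are taken from the verified reply quoted above, and how multiple assays repeat (each with its own `^^^` prefix, as assumed here) is not confirmed by this note.

```python
DEFAULT_BODY_FLUID = "5"  # 5 = Serum


def wrap_assay(code, dilution="1.0", version="1.0"):
    """Wrap a numeric program code as ^^^<dil>+<code>+<ver>."""
    return f"^^^{dilution}+{code}+{version}"


def build_order_record(barcode, codes, body_fluid=DEFAULT_BODY_FLUID):
    """Build one O record with 26 fields, addressed below as fields[n - 1] for O-n."""
    fields = [""] * 26
    fields[0] = "O"             # O-1  record type
    fields[1] = "1"             # O-2  sequence number
    fields[2] = barcode         # O-3  specimen ID (the tube barcode)
    # ASSUMPTION: every repeated assay carries its own ^^^ prefix.
    fields[4] = "\\".join(wrap_assay(code) for code in codes)   # O-5 universal test ID
    fields[5] = "R"             # O-6  priority
    fields[11] = "N"            # O-12 action code
    fields[15] = body_fluid     # O-16 body fluid (required by VITROS)
    fields[25] = "O"            # O-26 report type
    return "|".join(fields)
```

Called with barcode `260618001292` and the single code `330`, it places `^^^1.0+330+1.0` at O-5, `5` at O-16, and `O` at O-26, matching the positions named in the bullet.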

## Links
- Full guide (ships w/ code): `moon-erp-be/docs/lis/analyzer-middleware-guide.md`
- VITROS plan: https://moonui.elbaset.com/knowledge-base/plans/lis-vitros-onboarding-plan.html
- Audit: https://moonui.elbaset.com/knowledge-base/plans/lis-middleware-audit.html
- VITROS analyte codes: manual `public_html/Laboratory Information System (LIS) Guide.pdf` Appendix B (p.97).
