OIA Field Manual — Open-Source Intelligence Tradecraft
The required pre-exam reading for the GACS OSINT Intelligence Analyst (OIA) diploma. Codifies collection planning, sock-puppet OPSEC, SOCMINT, GEOINT, IMINT verification, dark-web tradecraft, and attribution discipline at the standard expected of a national-level open-source cell.
- 1
OSINT Doctrine & The Collection Cycle
Defines what OSINT is, what it is not, and the disciplined cycle every analyst follows from requirement to dissemination. This chapter is doctrine — every later chapter assumes it.
What OSINT Is — And Is Not
Open-source intelligence is information derived from publicly available sources that has been collected, exploited and disseminated to a specific intelligence consumer in answer to a specific requirement. Three phrases in that sentence do the work: 'publicly available' excludes hacked, leaked or stolen material accessed without authority; 'exploited' means processed, translated, geolocated, deduplicated, and corroborated, not merely captured; and 'specific consumer' together with 'specific requirement' means OSINT is never general curiosity: it always serves a decision. A screenshot is not intelligence. A captured page in a chain-of-custody bundle, geolocated to a building, time-anchored against an independent source, with a written confidence judgement, is intelligence.
The OSINT Collection Cycle
OSINT follows the same six-phase intelligence cycle as classified disciplines (planning & direction, collection, processing & exploitation, analysis & production, dissemination, feedback), but with a sharper emphasis on collection planning, because the open web is effectively infinite and an unplanned cell drowns. Planning converts the consumer's question into Priority Intelligence Requirements (PIRs) and Essential Elements of Information (EEIs), and then into a collection plan that names the platforms, the operators, the OPSEC tier and the cut-off time. Processing normalizes raw take into hashed, timestamped, source-attributed artifacts. Analysis stress-tests competing explanations. Dissemination matches format to consumer. Feedback updates the PIR set.
Priority Intelligence Requirements & EEIs
A PIR is a decision-grade question — 'Is target X physically present at location Y this week?' or 'Which infrastructure cluster supports the brand network we observed last month?'. An EEI is the observable that, once collected, advances the PIR — a geotagged image, a DNS A-record, a Telegram channel admin handle, a wallet deposit address. The discipline is to write the PIR before opening a browser, then enumerate the EEIs, then assign each EEI to a collection method with an OPSEC tier and a deadline. Without that ladder, collection becomes scavenging and analysis becomes opinion dressed in screenshots.
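The PIR-to-EEI ladder described above can be held in a small data structure so that untasked EEIs are visible before collection starts. The following is a minimal sketch only: the class names, field names, the tier labels and the example deadline are illustrative assumptions, not an OIA-mandated schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class EEI:
    """One observable that advances a PIR. Field names are illustrative."""
    observable: str
    method: str = ""                     # collection method assigned to this EEI
    opsec_tier: str = ""                 # e.g. "cold", "warm" or "hot" (assumed labels)
    deadline: Optional[datetime] = None  # collection cut-off, in UTC

    def is_tasked(self) -> bool:
        # Collectable only once method, tier and deadline are all assigned.
        return bool(self.method and self.opsec_tier and self.deadline)


@dataclass
class PIR:
    """A decision-grade question and the EEIs that would answer it."""
    question: str
    eeis: List[EEI] = field(default_factory=list)

    def untasked(self) -> List[EEI]:
        return [e for e in self.eeis if not e.is_tasked()]


pir = PIR("Which infrastructure cluster supports the brand network we observed last month?")
pir.eeis.append(EEI("DNS A-record for each brand domain", "passive DNS lookup", "cold",
                    datetime(2025, 1, 31, tzinfo=timezone.utc)))  # hypothetical deadline
pir.eeis.append(EEI("Telegram channel admin handle"))  # not yet assigned

for gap in pir.untasked():
    print("Untasked EEI:", gap.observable)
```

Any EEI still reported by `untasked()` is a signal that the collection plan is incomplete and the browser should stay closed.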
Source Reliability & Information Credibility (Admiralty Code)
Every OSINT source is rated on the NATO Admiralty scale: source reliability runs from A (completely reliable) to E (unreliable), with F reserved for sources whose reliability cannot be judged; information credibility runs from 1 (confirmed by other sources) to 5 (improbable), with 6 reserved for information whose truth cannot be judged. A finding sourced 'B2' (usually reliable, probably true) is materially different from 'D4' (not usually reliable, doubtful). Analysts cite the rating in the evidence appendix, not just the link. Mirrored content, cached pages, archive copies and re-uploads are rated against the *original* source's reliability, not against the convenience of the archive.
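A rating such as 'B2' can be expanded mechanically so the evidence appendix always carries the wording as well as the code. This is a minimal sketch: the function name is hypothetical, and the wordings for grades not spelled out in the text above (C, E, 3, 5) follow the commonly published Admiralty descriptions.

```python
# Source reliability grades (letter) and information credibility grades (digit).
RELIABILITY = {
    "A": "completely reliable",
    "B": "usually reliable",
    "C": "fairly reliable",
    "D": "not usually reliable",
    "E": "unreliable",
    "F": "reliability cannot be judged",
}
CREDIBILITY = {
    "1": "confirmed by other sources",
    "2": "probably true",
    "3": "possibly true",
    "4": "doubtful",
    "5": "improbable",
    "6": "truth cannot be judged",
}


def describe_rating(rating: str) -> str:
    """Expand a two-character rating such as 'B2' into its wording."""
    rating = rating.strip().upper()
    if len(rating) != 2 or rating[0] not in RELIABILITY or rating[1] not in CREDIBILITY:
        raise ValueError(f"not an Admiralty rating: {rating!r}")
    return f"{rating}: {RELIABILITY[rating[0]]}, {CREDIBILITY[rating[1]]}"


print(describe_rating("B2"))
print(describe_rating("D4"))
```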
Provenance, Chain-of-Custody & Hashing
Open-source material is only intelligence if it survives challenge. Every captured artifact carries: source URL (full canonical), capture timestamp in UTC with the analyst's clock source, capturing operator handle, SHA-256 of the file, a parallel save to an immutable archive (archive.today, the Internet Archive, or an internal evidence vault), and the rendering tool/version. Screenshots are augmented with a full-page DOM capture (HTML + assets) where possible because a screenshot can be cropped, recoloured or relabeled in seconds. The standard is simple: if a prosecutor, regulator, journalist or adversary cannot reproduce your evidence, you do not have evidence — you have a story.
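The per-artifact fields listed above map naturally onto a small record built at capture time. The sketch below assumes the artifact is already in memory as bytes; the function names, dictionary keys and the default clock-source label are illustrative, not a prescribed vault format.

```python
import hashlib
from datetime import datetime, timezone


def build_custody_record(data: bytes, source_url: str, operator: str,
                         rendering_tool: str,
                         clock_source: str = "system clock (unverified)") -> dict:
    """Return chain-of-custody metadata for one captured artifact."""
    return {
        "source_url": source_url,  # full canonical URL
        "captured_at_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "clock_source": clock_source,
        "operator": operator,      # capturing operator handle
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "rendering_tool": rendering_tool,
        "archive_copies": [],      # filled in once parallel archive saves exist
    }


def matches_record(data: bytes, record: dict) -> bool:
    """Re-hash the artifact and compare against the stored digest."""
    return hashlib.sha256(data).hexdigest() == record["sha256"]
```

Anyone challenging the evidence can re-run `matches_record` on the stored file: a mismatch means the artifact is not the one that was captured.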
Legal & Ethical Boundaries
OSINT operates inside the law of the analyst's jurisdiction, the platform's terms of service, and a written ethics policy. Bright lines: do not use stolen credentials; do not bypass authentication or paywalls except where explicitly licensed; do not impersonate law enforcement, government officials or the target's contacts; do not contact minors; do not deceive in jurisdictions that criminalize pretexting; do not collect against domestic populations without authority. The gray zone — scraping ToS-restricted pages, joining invite-only channels under a persona, archiving content from sites that prohibit it — is handled by a written rules-of-engagement document signed before the operation starts, not improvised mid-case.
Confidence Scoring & Analytic Caveats
Every OIA finding carries an explicit confidence level — low, moderate, or high — tied to source quality, source independence, and corroboration. 'High confidence' means the judgement is well-supported and unlikely to change without significant new information. Caveats are first-class content: 'attribution rests on infrastructure overlap and could be invalidated by shared-hosting reuse', 'the geolocation depends on a single shadow azimuth and is sensitive to date error of ±2 weeks', 'the persona linkage assumes the avatar reuse is not coincidental.' Overstated confidence destroys credibility faster than any single bad call.
- 2
OPSEC & Investigative Infrastructure
Covers the analyst's operational security posture, sock-puppet management, attribution-aware browsing, and the engineering discipline that prevents the target from learning they are being watched.
Threat Modelling The Investigation
Before the first query, model the threat: who is the target, what is their technical capability, do they own or operate any of the platforms you'll touch, can they pull server logs, and what is the cost to the operation if they detect collection? A romance-scam call-center cannot pull Cloudflare logs; a nation-state-linked influence operator can and will. The OPSEC tier of every artifact — browser profile, identity, network egress, payment instrument — is set against the strongest plausible adversary, not the average one. Overshoot is cheap; undershoot burns the case.
Network Egress & Attribution Surface
Default analyst egress (corporate IP, home IP, mobile carrier IP) is forbidden against any non-trivial target. The standard stack is: a dedicated investigation workstation or VM, a clean browser profile per persona, network egress through a residential or mobile proxy in a region consistent with the persona's cover, DNS resolution scoped to the proxy, no telemetry leakage from the OS, no logged-in personal accounts, and no shared clipboard between personas. WebRTC leakage, canvas fingerprinting, font enumeration, audio-context fingerprinting and TLS JA3 signatures are all tracked by mature adversaries; the browser profile must be hardened or replaced with a fingerprint-resistant browser.
Sock-Puppet Lifecycle Management
A sock puppet is a long-lived investigative persona with a credible backstory, consistent platform footprint, aged accounts, plausible network behavior and a documented OPSEC dossier. Lifecycle: define cover (name, DOB, location, employer, interests, photo set), register identity infrastructure (email, phone, payment) consistent with cover, age the accounts for weeks before activation, build organic-looking activity (likes, follows, comments, posts) on benign topics, and only then deploy against the target. Personas are tiered — cold (read-only browsing), warm (low-interaction following), hot (active engagement) — and a persona is *retired*, not reused across unrelated cases, because cross-case linkage of a puppet exposes every case it touched.
Identity Infrastructure (Email, Phone, Payment)
Persona email runs on a provider that does not require recovery linked to the real analyst; persona phone uses a clean SIM (eSIM, prepaid, or virtual number tier appropriate to the threat model — virtual numbers are burned by many adversary platforms) on a device that is not the analyst's personal phone; persona payment uses pre-funded virtual cards in the persona's name, never the analyst's card. The cardinal sin is the recovery cross-link: a persona email whose recovery address is the analyst's real address, or a persona phone whose carrier account is in the analyst's real name, collapses the entire identity on a single subpoena.
Image Hygiene & Visual Persona
Persona avatars and lifestyle images must not appear elsewhere on the indexed web. Generated faces (e.g. StyleGAN) defeat reverse-image search but carry their own tells (eye asymmetry, ear artifacts, background warping). Stock photos and stolen images fail reverse-image search instantly. Best practice is a small, internally generated, version-controlled image library per persona, audited quarterly against reverse-image engines (Yandex, Google Lens, TinEye, PimEyes-class tools used defensively).
Evidence Vault & Operational Logging
All collection lands in an evidence vault: append-only storage, per-case bucket, per-artifact metadata (hash, capture time, operator, source URL, rendering tool, OPSEC tier, classification). An operational log records who searched what, when, from which persona, against which platform, with which query — both for analytic reproducibility and for after-action review when something leaks. The log is itself sensitive: it reveals tradecraft and persona linkages and must be access-controlled.
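One way to make an operational log tamper-evident is to chain each entry to the hash of the previous one. This is an assumption layered on top of the text, which only requires append-only storage; the function names and entry keys are hypothetical, and a real vault would also need access control and durable storage.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder hash for the first entry in a log


def _digest(body: Dict) -> str:
    # Canonical JSON (sorted keys) so the same entry always hashes identically.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: List[Dict], entry: Dict) -> Dict:
    """Append an entry that commits to the hash of the previous entry."""
    prev_hash = log[-1]["entry_hash"] if log else GENESIS
    body = dict(entry, prev_hash=prev_hash)
    record = dict(body, entry_hash=_digest(body))
    log.append(record)
    return record


def chain_is_intact(log: List[Dict]) -> bool:
    """Recompute every hash; any edited or removed entry breaks the chain."""
    prev_hash = GENESIS
    for record in log:
        body = {k: v for k, v in record.items() if k != "entry_hash"}
        if body.get("prev_hash") != prev_hash or _digest(body) != record["entry_hash"]:
            return False
        prev_hash = record["entry_hash"]
    return True
```

Each entry would typically carry the operator, persona, platform, query and UTC timestamp; the chain only proves the sequence has not been altered after the fact, not that the entries were truthful when written.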
Burn Conditions & Persona Retirement
A persona is burned when: the target or their associates challenge the persona's identity, the platform suspends or shadow-bans the account, an OPSEC failure links the persona to another persona or to the analyst, or the case closes. Burned personas are retired: their credentials are rotated and sealed in a vault, no further activity is performed under them, and any open case they touched is reviewed for collateral exposure. Reusing a burned persona is the single most common way mature OSINT cells get caught.
- 3
SOCMINT — Social Media Intelligence
Covers persona-aware collection, pivoting and link analysis across social platforms, with emphasis on durable indicators that survive account churn.
Platform Affordances & Collection Surface
Every platform exposes a different collection surface: public profile fields, post histories, follower/following graphs, reactions, group memberships, live streams, stories, comments, location tags, embedded media EXIF, link previews, app-attribution strings ('via iPhone'), and timestamps shown in the viewer's timezone versus the poster's. The analyst maps the surface per platform before collecting, because what you can see read-only versus what requires interaction determines the OPSEC tier of the engagement. Treat every platform's API and scraping posture as a moving target: capture aggressively when access is open, because tomorrow the field may be gone.
Username & Handle Pivoting
The most durable SOCMINT pivot is the handle. Targets reuse handles across platforms for ego, brand consistency, or laziness. Standard workflow: enumerate the handle and minor variants (underscores, year suffixes, leetspeak), check the handle across a broad platform set, capture each hit with provenance, and triage hits by recency, content alignment and biographical consistency. A handle hit is a *lead*, not a finding — corroboration requires a second independent signal (writing style, image reuse, mutual contacts, timezone activity pattern, shared infrastructure).
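The variant-enumeration step can be sketched as a small generator of candidate handles. The substitution table and separator rules below are illustrative assumptions, deliberately small; the output is a list of leads to check, nothing more.

```python
from typing import Iterable, Set

# A deliberately small leetspeak table; real handles use many more substitutions.
LEET = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5"}


def handle_variants(handle: str, year_suffixes: Iterable[str] = ()) -> Set[str]:
    """Enumerate minor variants of a handle: separators, leetspeak, year suffixes."""
    base = handle.lower()
    stems = {
        base,
        base.replace("_", ""),   # separator dropped
        base.replace("_", "."),  # underscore swapped for dot
        base.replace(".", "_"),  # dot swapped for underscore
    }
    # Add a fully leet-substituted form of every stem.
    stems |= {"".join(LEET.get(ch, ch) for ch in stem) for stem in set(stems)}

    variants = set(stems)
    for stem in stems:
        for suffix in year_suffixes:
            variants.add(f"{stem}{suffix}")
            variants.add(f"{stem}_{suffix}")
    return variants


for candidate in sorted(handle_variants("night_owl", year_suffixes=["1990"])):
    print(candidate)
```

Every candidate that resolves on a platform is still only a lead and needs the second independent signal described above before it enters a finding.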
Email & Phone Pivoting
Email addresses and phone numbers pivot through breach corpora, account-recovery probes (where lawful), business databases, social-platform sign-up checks ('this email is already in use'), and Gravatar-style avatar APIs. Recovery probes must be used with extreme care: many platforms notify the account owner of recovery attempts, burning the operation. Breach data is used as a *pointer* to public records, never quoted as evidence, because the legal status of breach material varies and its admissibility is fragile.
Network Analysis — Followers, Mutuals, Engagement Graphs
Lone accounts are weak; networks are strong. Map the target's first-degree contacts, then second-degree contacts shared with other accounts of interest. Mutuals that persist across multiple targets are high-value pivots — they often represent the operational core. Engagement graphs (who replies to whom, who likes within minutes of posting, who is in the same group at the same time) reveal coordination invisible to a profile-by-profile view. Output is a link diagram with edge weights and an evidence appendix per edge.
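Finding mutuals that persist across several targets is a counting problem once the first-degree contact lists have been captured. A minimal sketch, assuming contacts are already collected as plain account identifiers; the function name and threshold are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def shared_contacts(contacts_by_target: Dict[str, Iterable[str]],
                    min_targets: int = 2) -> List[Tuple[str, int]]:
    """Rank accounts by how many distinct targets list them as a contact."""
    counts: Counter = Counter()
    for target, contacts in contacts_by_target.items():
        # De-duplicate per target and ignore self-references.
        counts.update(set(contacts) - {target})
    ranked = [(account, n) for account, n in counts.items() if n >= min_targets]
    # Most widely shared first; ties broken alphabetically for stable output.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


graph = {
    "target_1": ["acct_a", "acct_b", "acct_c"],
    "target_2": ["acct_b", "acct_c", "acct_d"],
    "target_3": ["acct_c", "acct_e"],
}
print(shared_contacts(graph))
```

The counts can serve as edge weights in the link diagram, with each edge still backed by its own captured evidence.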
Writing-Style, Temporal & Linguistic Indicators
Stylometric indicators — punctuation habits, emoji choice, signature phrases, code-switching, recurring typos, time-of-day activity, language pairs — are durable across handle changes. Temporal indicators (consistent 03:00–11:00 UTC silence) constrain timezone and probable geography. Linguistic indicators (Cyrillic у masquerading as Latin y, regionally specific slang, machine-translation artifacts) constrain origin. None of these is dispositive alone; combined, they are powerful.
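The temporal indicator can be sketched as a search for the longest run of UTC hours in which an account never posts. This is a simplified illustration: it assumes post timestamps have already been normalized to UTC hours, and it treats a single stray post as activity, so real data would need a tolerance threshold.

```python
from typing import Iterable, Tuple


def longest_quiet_window(post_hours_utc: Iterable[int]) -> Tuple[int, int]:
    """Return (start_hour, length_in_hours) of the longest posting silence.

    The search wraps around midnight, because a sleep window often does.
    """
    active = [False] * 24
    for hour in post_hours_utc:
        active[hour % 24] = True
    if not any(active):
        return (0, 24)  # no posts at all: the whole day is quiet

    best_start, best_length = 0, 0
    for start in range(24):
        length = 0
        while length < 24 and not active[(start + length) % 24]:
            length += 1
        if length > best_length:
            best_start, best_length = start, length
    return best_start, best_length


# An account seen posting in every hour from 11:00 to 02:59 UTC.
hours = list(range(11, 24)) + [0, 1, 2]
print(longest_quiet_window(hours))
```

A quiet window is a constraint on probable timezone, not a location; shift work, scheduled posting and deliberate deception all produce the same pattern.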
Coordinated Inauthentic Behavior (CIB) Detection
Influence operations are detected by coordination, not by content. Indicators: clusters of accounts created in a narrow window, near-identical bios, copy-paste posts within minutes across accounts, shared profile-image templates, identical follow patterns, the same external domain pushed by accounts that share no organic audience, and unnatural engagement velocity. CIB findings carry a structural argument and a per-account evidence table; never accuse an individual account of inauthenticity without showing the cluster behavior that justifies the call.
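One of the indicators above, accounts created in a narrow window, can be sketched as a sliding scan over sorted creation times. The window length and minimum cluster size are illustrative assumptions and would need tuning per platform; a burst is a reason to look closer, not a CIB finding on its own.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def creation_bursts(created_at: Dict[str, datetime],
                    window: timedelta = timedelta(hours=24),
                    min_size: int = 3) -> List[List[str]]:
    """Group accounts whose creation times fall within one window of each other."""
    ordered = sorted(created_at.items(), key=lambda item: item[1])
    bursts: List[List[str]] = []
    i = 0
    while i < len(ordered):
        j = i
        # Extend the group while the next account is within the window of the first.
        while j + 1 < len(ordered) and ordered[j + 1][1] - ordered[i][1] <= window:
            j += 1
        if j - i + 1 >= min_size:
            bursts.append([name for name, _ in ordered[i:j + 1]])
            i = j + 1
        else:
            i += 1
    return bursts


accounts = {
    "acct_a": datetime(2024, 1, 1, 10, 0),
    "acct_b": datetime(2024, 1, 1, 12, 0),
    "acct_c": datetime(2024, 1, 1, 20, 0),
    "acct_d": datetime(2024, 3, 5, 9, 0),
    "acct_e": datetime(2024, 6, 9, 9, 0),
}
print(creation_bursts(accounts))
```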
Capture Discipline — Posts, Stories, Streams
Stories and live content are ephemeral; capture immediately. Use full-page DOM capture plus screenshot plus media download where lawful. Record platform timestamps (poster-local, viewer-local, server-UTC where exposed) and the viewer's persona and proxy egress at capture time. Embedded media is hashed separately so the same image found later under a different post can be matched.
- 4
Image, Video & Geospatial Intelligence (IMINT / GEOINT)
Covers verification, geolocation, chronolocation, and adversary deception detection for visual material.
Image Verification Workflow
Every operationally important image is run through a fixed pipeline: provenance check (where did the file first appear), reverse-image search across multiple engines (no single engine wins everywhere: Yandex outperforms Google in many domains, and Bing does in others), EXIF & XMP extraction, error-level analysis and noise inspection for splice indicators, and chronolocation against the visible environment. Generative-AI detection is performed but is treated as a *hint*, not a verdict: current detectors have meaningful false-positive and false-negative rates and must not be quoted as ground truth.
EXIF, XMP & Metadata — Use & Limits
Camera metadata can record device model, lens, ISO, shutter, GPS coordinates, capture timestamp in camera-local time, and software pipeline. It is genuinely powerful when present and intact, and almost always stripped by social platforms on upload. Treat EXIF on a platform-sourced image as missing-by-default; treat EXIF on an image obtained as the original file (for example, sent directly rather than re-uploaded through a platform) as forensically interesting but trivially forgeable, so corroborate with visual chronolocation.
Geolocation Tradecraft
Geolocation works by reducing the search space through stacked constraints: language and script on signage, road-marking style and color (which differs by country), electrical-pole and traffic-light geometry, vegetation and biome, license-plate format, architecture style, and matchable landmarks. The analyst forms a hypothesis hierarchy (country → region → city → district → street → exact frame), and each step is justified with a visible feature. Final confirmation uses satellite imagery, street-view, terrain models, and where ethical, real-estate listing photos. Document the chain: the audit trail is what makes the geolocation defensible.
Chronolocation — Dating An Image
Dating an image uses shadow azimuth and elevation (sun position constrains latitude × time-of-year × time-of-day), weather (cross-reference local METAR records), foliage state, snow cover, construction state of nearby buildings (compared against dated satellite history), event posters or graffiti dated by other sources, and platform-side timestamps. A shadow-only chronolocation typically carries a ±2-week window, not a single day, and the report must say so.
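The shadow constraint can be illustrated with a deliberately simplified calculation. The sketch below assumes the photo was taken at local solar noon on flat ground, uses a standard cosine approximation for solar declination, and ignores refraction and the equation of time; the function names and the one-degree tolerance are illustrative. Production work would use a proper ephemeris.

```python
import math
from typing import List


def solar_elevation_deg(object_height: float, shadow_length: float) -> float:
    """Sun elevation implied by a vertical object and its shadow on flat ground."""
    return math.degrees(math.atan2(object_height, shadow_length))


def declination_deg(day_of_year: int) -> float:
    """Approximate solar declination (about -23.44 deg at the December solstice)."""
    return -23.44 * math.cos(math.radians(360.0 / 365.0 * (day_of_year + 10)))


def noon_elevation_deg(latitude_deg: float, day_of_year: int) -> float:
    """Sun elevation at local solar noon for a given latitude and day."""
    return 90.0 - abs(latitude_deg - declination_deg(day_of_year))


def candidate_days(latitude_deg: float, measured_elevation_deg: float,
                   tolerance_deg: float = 1.0) -> List[int]:
    """Days of the year whose noon elevation matches the measurement."""
    return [day for day in range(1, 366)
            if abs(noon_elevation_deg(latitude_deg, day) - measured_elevation_deg)
            <= tolerance_deg]


elevation = solar_elevation_deg(object_height=2.0, shadow_length=2.0)
print(round(elevation, 1))
print(candidate_days(latitude_deg=50.0, measured_elevation_deg=40.0))
```

The candidate list usually contains two separate runs of days, one on each side of a solstice, which is one reason a shadow measurement alone cannot fix a single date.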
Satellite & Aerial Imagery
Free and commercial satellite sources (Sentinel-2, Landsat, Planet, Maxar where licensed) provide dated overhead imagery. Workflow: identify the AOI, pull the image stack, perform change detection between dates, annotate features with measurement (length, area), and overlay with vector context (roads, admin boundaries). Cloud cover, sun glint and off-nadir distortion are recorded; conclusions about counts or movement are caveated by image resolution and the time-gap between captures.
Video Verification & Frame-Level Analysis
Video adds temporal continuity but also adds tampering surfaces (cuts, re-encodings, mirroring, speed changes, audio re-laying). Workflow: extract keyframes, hash, run each through the still-image verification pipeline, examine compression artifacts at cut points, examine audio for splice indicators (level discontinuities, room-tone shifts, lip-sync drift), and reverse-search the leading frames to find prior uploads. A video found earlier under a different caption is decisive evidence of recontextualization.
Synthetic Media & Adversarial Manipulation
Generative imagery and deepfake video are now operationally relevant. Tells include: anatomical errors at hand and ear, hairline blending failure, background warping at occlusions, temporal flicker in deepfake video, audio cadence and breath patterns inconsistent with the speaker's archive. Provenance signals (C2PA, content credentials) help where present. The analyst's posture is to assume sophisticated synthesis is possible, demand independent corroboration before treating a single dramatic image or clip as decisive, and to write the synthetic-media risk into the caveat block.
- 5
Infrastructure Intelligence (Domains, DNS, Hosting, TLS)
Covers durable adversary attribution through the infrastructure they reuse — domains, DNS, certificates, hosting, and operational identifiers.
Why Infrastructure Beats Personas
Personas are cheap and disposable; infrastructure is expensive and sticky. Adversaries rotate handles weekly and infrastructure quarterly. Attribution that rests on a name will die when the name does; attribution that rests on an infrastructure cluster — a reused name-server, a TLS certificate fingerprint, a Google Analytics ID, a recycled bulletproof hosting block — survives the rotation. The OIA standard is to anchor every attribution to at least one durable infrastructure indicator.
WHOIS, RDAP & Registrant Pivoting
WHOIS and its modern successor RDAP expose registrar, creation/expiry dates, name-servers, and (where not redacted) registrant identifiers. Historical WHOIS is often more useful than current WHOIS, because adversaries who later privatize their records left clear ownership data in earlier snapshots. Pivot points: registrant email, registrant phone, registrar account ID, name-server set, and creation-date clustering. A shared name-server pair across forty otherwise unrelated domains is a high-value cluster signal.
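The name-server pivot reduces to grouping domains by their exact name-server set. A minimal sketch, assuming the name-server lists have already been pulled from WHOIS/RDAP or DNS; the function name and the sample hostnames are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List


def cluster_by_nameservers(nameservers_by_domain: Dict[str, Iterable[str]],
                           min_size: int = 2) -> Dict[FrozenSet[str], List[str]]:
    """Group domains that share an identical name-server set."""
    clusters: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for domain, nameservers in nameservers_by_domain.items():
        # Normalize case and trailing dots so equivalent records compare equal.
        key = frozenset(ns.lower().rstrip(".") for ns in nameservers)
        if key:
            clusters[key].append(domain)
    return {key: sorted(domains) for key, domains in clusters.items()
            if len(domains) >= min_size}


records = {
    "brand-one.example": ["ns1.dns-host.example", "ns2.dns-host.example"],
    "brand-two.example": ["NS2.dns-host.example.", "ns1.dns-host.example."],
    "unrelated.example": ["ns1.other-host.example", "ns2.other-host.example"],
}
for ns_set, domains in cluster_by_nameservers(records).items():
    print(sorted(ns_set), "->", domains)
```

A shared set at a large commercial DNS provider is weak signal; the cluster only becomes interesting when the name-server pair is rare.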
DNS Records & Passive DNS
Active DNS gives the current record set; passive DNS gives the historical record set assembled by sensors across resolvers. A passive-DNS view shows every IP a domain has resolved to, every other domain that has shared those IPs, and the time windows of each association. This is how shared hosting (low signal) is distinguished from a tight bulletproof cluster (high signal). MX records reveal mail infrastructure, TXT records reveal SPF/DKIM/DMARC and frequently leak third-party service identifiers (Google site verification, Microsoft 365 tenant IDs, SaaS verification tokens) that pivot across the cluster.
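The co-hosting question (which other domains shared an IP with the seed domain, and when) can be sketched over passive-DNS rows. The record layout below is an assumption; real passive-DNS providers each return their own format, and the addresses shown come from documentation ranges.

```python
from datetime import date
from typing import Dict, List, NamedTuple, Tuple


class Resolution(NamedTuple):
    """One passive-DNS observation (field names are illustrative)."""
    domain: str
    ip: str
    first_seen: date
    last_seen: date


def cohosted_domains(records: List[Resolution],
                     seed: str) -> Dict[str, List[Tuple[str, date, date]]]:
    """Domains that shared an IP with the seed during an overlapping time window."""
    hits: Dict[str, List[Tuple[str, date, date]]] = {}
    for seed_row in (r for r in records if r.domain == seed):
        for other in records:
            if other.domain == seed or other.ip != seed_row.ip:
                continue
            start = max(seed_row.first_seen, other.first_seen)
            end = min(seed_row.last_seen, other.last_seen)
            if start <= end:  # the two observation windows overlap
                hits.setdefault(other.domain, []).append((other.ip, start, end))
    return hits


pdns = [
    Resolution("seed.example", "192.0.2.10", date(2024, 1, 1), date(2024, 3, 1)),
    Resolution("sibling.example", "192.0.2.10", date(2024, 2, 1), date(2024, 4, 1)),
    Resolution("later-tenant.example", "192.0.2.10", date(2024, 6, 1), date(2024, 7, 1)),
    Resolution("elsewhere.example", "198.51.100.5", date(2024, 1, 1), date(2024, 3, 1)),
]
print(cohosted_domains(pdns, "seed.example"))
```

Requiring the time windows to overlap filters out later tenants of a recycled IP, which is a common source of false clusters.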
TLS Certificates & Certificate Transparency
In practice, nearly every publicly trusted TLS certificate is logged to Certificate Transparency, because major browsers reject certificates that lack CT proof. CT logs expose every name a certificate covers, the issuer, the validity window, and the certificate fingerprint. Operationally this means: a single Let's Encrypt issuance for fraud-portal-77.example may also list fraud-portal-78.example as a SAN if the operator requested both names on one certificate, exposing the next domain in the cluster before it goes live. JA3S/JARM fingerprints of the TLS server give an additional clustering dimension for adversaries who don't rotate their stack.
Hosting, ASN & Bulletproof Identification
Every IP maps to an Autonomous System (ASN) with an owner, country and reputation. Mainstream hyperscalers carry low signal — millions of legitimate tenants share their ASNs — but a cluster of fraud domains all resolving inside a small bulletproof provider's /24 is a strong attribution anchor. Maintain an internal ledger of known bulletproof providers, fast-flux providers, and reseller chains that obscure ownership; refresh quarterly because the market churns.
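Spotting a cluster of domains inside one small address block can be sketched with the standard-library `ipaddress` module. The sketch handles IPv4 only and assumes one resolved IP per domain; the minimum cluster size is an illustrative threshold, and mapping the block to an ASN and provider remains a separate lookup.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, List


def group_by_slash24(ip_by_domain: Dict[str, str],
                     min_domains: int = 3) -> Dict[str, List[str]]:
    """Group domains by the IPv4 /24 block their address falls in."""
    blocks: Dict[str, set] = defaultdict(set)
    for domain, ip in ip_by_domain.items():
        # strict=False lets us pass a host address and get its enclosing network.
        network = ipaddress.ip_network(f"{ip}/24", strict=False)
        blocks[str(network)].add(domain)
    return {net: sorted(domains) for net, domains in blocks.items()
            if len(domains) >= min_domains}


resolved = {
    "portal-1.example": "203.0.113.10",
    "portal-2.example": "203.0.113.77",
    "portal-3.example": "203.0.113.201",
    "shop.example": "198.51.100.4",
}
print(group_by_slash24(resolved))
```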
Operational Identifiers (Analytics, Tag Managers, SDKs)
Front-end source code leaks identifiers operators forget to vary. Google Analytics IDs, Google Tag Manager IDs, Facebook Pixel IDs, AdSense publisher IDs, Hotjar site IDs, Segment write keys, and JS SDK config blocks frequently recur across an operator's portfolio of sites. Source-code search engines (e.g. PublicWWW-class tooling) pivot from an identifier to every other indexed page that contains it. This is the highest-yield infrastructure pivot for low-budget operators who use the same marketing stack across all brands.
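Extracting tracking identifiers from captured page source is a regular-expression pass followed by a cross-page comparison. The patterns below are approximations of commonly seen ID shapes, not vendor specifications, and cover only three of the identifier families named above; the function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Approximate shapes only; vendors do not guarantee these formats.
ID_PATTERNS = {
    "google_analytics_ua": re.compile(r"\bUA-\d{4,10}-\d{1,4}\b"),
    "google_tag_manager": re.compile(r"\bGTM-[A-Z0-9]{4,9}\b"),
    "adsense_publisher": re.compile(r"\bca-pub-\d{10,20}\b"),
}


def extract_identifiers(html: str) -> Set[str]:
    """Return every identifier-like string found in one page's source."""
    found: Set[str] = set()
    for pattern in ID_PATTERNS.values():
        found.update(pattern.findall(html))
    return found


def shared_identifiers(pages: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each identifier to the pages containing it, keeping only shared ones."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for url, html in pages.items():
        for identifier in extract_identifiers(html):
            seen[identifier].add(url)
    return {ident: sorted(urls) for ident, urls in seen.items() if len(urls) > 1}


captured = {
    "https://a.example": "<script>gtag('config', 'UA-12345678-1');</script>",
    "https://b.example": "ga('create', 'UA-12345678-1'); dataLayer GTM-ABCD123",
    "https://c.example": "<p>no trackers here</p>",
}
print(shared_identifiers(captured))
```

An identifier shared by two sites can also come from a common template or agency, so it is a pivot to investigate, not attribution by itself.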
Email & Mail-Server Infrastructure
Phishing and fraud-grooming infrastructure leaks via mail headers: source IP of the sending MTA, return-path, DKIM signing domain, message-ID format, and X-Mailer strings. Capture full headers on every operationally relevant message; never quote only the body. Mail-server clustering by sending IP and DKIM domain often identifies the upstream infrastructure provider when the public-facing domain is freshly registered and otherwise unattributable.
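The header fields named above can be pulled out of a raw message with the standard-library `email` package. This is a minimal sketch: the function name and output keys are hypothetical, only IPv4 literals in square brackets are recognized, and the bottom-most `Received` header is taken as the hop closest to the sender even though that header can be forged.

```python
import re
from email import message_from_string
from email.utils import parseaddr
from typing import Dict, Optional

IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")
DKIM_DOMAIN_TAG = re.compile(r"(?:^|;)\s*d=([^;\s]+)")


def header_indicators(raw_message: str) -> Dict[str, Optional[str]]:
    """Extract clustering indicators from one message's full headers."""
    msg = message_from_string(raw_message)

    received = msg.get_all("Received") or []
    origin_ip = None
    if received:
        # Headers are prepended per hop, so the last one is the earliest hop.
        match = IPV4_IN_BRACKETS.search(received[-1])
        origin_ip = match.group(1) if match else None

    dkim = DKIM_DOMAIN_TAG.search(msg.get("DKIM-Signature", ""))
    return {
        "return_path": parseaddr(msg.get("Return-Path", ""))[1] or None,
        "dkim_domain": dkim.group(1) if dkim else None,
        "message_id": msg.get("Message-ID"),
        "x_mailer": msg.get("X-Mailer"),
        "origin_ip": origin_ip,
    }
```

Running this over every captured message and grouping by `origin_ip` and `dkim_domain` gives the mail-server clustering described above; the raw headers still go into the evidence vault unmodified.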
- 6
Dark Web, Forum & Messenger Intelligence
Covers Tor and I2P collection, forum tradecraft, encrypted-messenger intelligence and the tradecraft of operating inside hostile-platform environments.
Tor & I2P — Access & OPSEC
Tor and I2P are routine OSINT collection surfaces, not exotic territory. Access is performed from a dedicated investigation workstation or VM with Tor Browser (do not customize aggressively — fingerprint differentiation defeats the anti-fingerprinting design). Do not log in to clear-net accounts while on the Tor circuit. Do not download executables from .onion services to the analyst host. Do not run JavaScript on unknown .onion services unless the threat model permits it. Capture pages with archive tools that preserve content as well as the .onion address.
Onion-Site Discovery & Verification
Discovery channels include indexed directories (Ahmia and equivalents), market-reference lists, leak blogs, and pivots from clear-net mentions of an onion address. Verify before quoting: phishing clones of major markets and leak sites are routine. The standard is to anchor an onion address to two independent references (operator's signed announcement, mirror list, PGP-signed message in the announce channel) before treating it as canonical.
Marketplaces, Leak Sites & Forum Tradecraft
Marketplaces, leak/extortion blogs and underground forums each have distinct norms. Registration usually requires invitation or a low-effort referral fee. Read-only collection is preferred; engagement requires a hardened persona tier and explicit authorization. Mirror everything immediately — leak blogs go down on takedown notices; marketplaces exit-scam; forums purge accounts. Treat any single mirror as fragile; preserve the original onion address, the snapshot, the hash, and at least one independent capture.
Encrypted Messengers — Telegram, Signal, Matrix, Discord
Public Telegram channels and supergroups are durable OSINT surfaces; Signal is largely closed; Matrix and Discord are mixed. Telegram pivots: channel ID (immutable), creation time, admin set, forwarded-from chains, embedded media hashes, and the channel's mention graph. Capture continuously, because adversary channels are nuked on report. Persona policy: use a phone-number tier appropriate to the threat model; do not use the analyst's real number for any channel that the target controls or moderates.
Cryptocurrency Indicators In Dark-Web Context
Onion-hosted services advertise BTC, XMR and stablecoin payment addresses for products, ransoms or escrow. Capture every address with the offering page and timestamp, then pivot on-chain by clustering the address against known services (see Chapter 8 of MICA). XMR is largely opaque to clustering by design; treat XMR-only operators as attribution-hardened and rely more heavily on infrastructure and persona pivots.
Hostile Platforms & Counter-Surveillance Signals
Many hostile platforms watch their watchers: time-on-page, scroll behavior, JS execution, downloaded assets, and cross-page browsing pattern. Sophisticated operators deploy honeypot links, persona-trap pages, and watermarked download files that ping their server on open. Behave like a member, not a researcher: throttle pace, vary access times, do not enumerate the whole site in one session, and never download a watermarked asset to a host that can be traced back.
- 7
Attribution, Analysis & Reporting
Pulls the discipline together: ACH against the evidence, attribution at the right level of confidence, and a written product that survives challenge.
Levels Of Attribution (Persona → Operator → Organization → Sponsor)
Attribution is a ladder. Persona attribution links a handle to an identity. Operator attribution links a human to a set of operational behaviors. Organizational attribution links a set of operators to a structured group with shared infrastructure, tooling, and tasking. Sponsor attribution links an organization to a backer (criminal, corporate or state). Confidence falls sharply as you climb the ladder, and the cost of being wrong rises sharply. The OIA standard is to publish the *highest defensible* rung and to write the gap to the next rung as an explicit collection requirement.
Analysis Of Competing Hypotheses (ACH) For Attribution
ACH applied to attribution: list every credible explanation for the observed indicator set (true actor, copy-cat, false-flag, shared toolchain, coincidence), score each evidentiary item as consistent, inconsistent or not-applicable against each hypothesis, and report the surviving hypothesis as the one with the least disconfirming evidence, not the one with the most confirming evidence. False-flag operations specifically exploit confirmation bias by planting indicators consistent with another actor; ACH disciplines you to look for what is *missing* in the evidence as well as what is present.
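The scoring rule (rank by least disconfirming evidence) can be sketched as a small matrix computation. The score codes 'C', 'I' and 'N' and the function names are illustrative assumptions, and the sample matrix is invented for demonstration; a real ACH also weights evidence by reliability, which this sketch does not.

```python
from typing import Dict, List, Tuple

# matrix[evidence][hypothesis] is "C" (consistent), "I" (inconsistent)
# or "N" (not applicable).
Matrix = Dict[str, Dict[str, str]]


def rank_hypotheses(matrix: Matrix) -> List[Tuple[str, int]]:
    """Order hypotheses by how much evidence contradicts them, fewest first."""
    inconsistent: Dict[str, int] = {}
    for scores in matrix.values():
        for hypothesis, score in scores.items():
            inconsistent.setdefault(hypothesis, 0)
            if score == "I":
                inconsistent[hypothesis] += 1
    return sorted(inconsistent.items(), key=lambda item: (item[1], item[0]))


def non_diagnostic(matrix: Matrix) -> List[str]:
    """Evidence scored identically against every hypothesis discriminates nothing."""
    return [evidence for evidence, scores in matrix.items()
            if len(set(scores.values())) == 1]


ach = {
    "reused name-server pair": {"true actor": "C", "copy-cat": "I", "false flag": "C"},
    "working-hours pattern":   {"true actor": "C", "copy-cat": "N", "false flag": "I"},
    "toolchain overlap":       {"true actor": "C", "copy-cat": "C", "false flag": "C"},
}
print(rank_hypotheses(ach))
print(non_diagnostic(ach))
```

Counting only the inconsistent scores is the point of the method: evidence that fits every hypothesis, such as the toolchain overlap here, adds confirming weight everywhere and therefore tells the analyst nothing.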
Indicators & Warnings (I&W) — Trigger Discipline
Define, before they occur, the observable signals that would shift your confidence in either direction — a new domain in the cluster, a new persona joining the channel, a wallet address resurfacing, an infrastructure provider rotation. Commit to the I&W list in writing; review it on a fixed cadence. I&W discipline is what separates a living intelligence picture from a one-shot report that ages out the day it ships.
Writing The Report — BLUF, Findings, Evidence Appendix
The OIA standard product opens with a BLUF (3–5 sentences: judgement, confidence, so-what). The executive summary expands for non-specialist readers. Findings are numbered, each with confidence, sourcing, and the specific evidence trail. An evidence appendix preserves URLs, archives, hashes, capture times and operator. Recommendations are separated from findings so the consumer can act without confusing your judgement with your preference. Every paragraph either advances the judgement or it is cut.
Caveats, Confidence & Honest Failure
Write the caveats first, not last — they shape the rest of the prose. Explicitly state what the evidence does *not* support, where the chain is single-sourced, and what would invalidate the judgement. Honest failure ('we could not resolve attribution beyond the operator level given current collection') is a stronger product than overreach. Update published findings in writing when new evidence arrives; build the update cadence into the product, not into the analyst's memory.
Tasking The Next Cycle
Every product closes the cycle by tasking the next one: which information gaps remain, which I&W are now in force, what collection (open-source, on-chain, legal-process, partner-liaison) would move the next rung of attribution, and what the consumer's revised PIRs are. An OIA case is never finished — it is paused at a confidence threshold, with an explicit trigger list for re-activation.
GACS.app — Academic & Intelligence Standards Division
