Quick guide to open-source intelligence
By Manuel Medina, Intelligence Analyst, Basel Institute on Governance
Anti-corruption, transparency and freedom of information initiatives over recent decades have significantly boosted the value of open-source intelligence for both the private and public sectors.
In this quick guide, Intelligence Analyst Manuel Medina explains what open-source intelligence is and explores some of the tricky questions it raises.
How OSINT fits into the intelligence cycle
Open-source intelligence, or OSINT as it is known in the intelligence community, is the systematic collection, processing and analysis of open-source information. This means information that is freely available to the general public, such as media articles, research reports or company records in an open business registry.
As I explained in my previous quick guide to intelligence, it is helpful to think of intelligence not as a static object but as a process. The output of this process can be a strategic or operational product. In both cases it is designed to help decision-makers understand a specific topic or threat in order to decide how to act to reach specific goals. If intelligence is a process, then open-source information is one of several possible inputs that can go into that process.
Open-source information alone can be a powerful tool to support evidence-based decision-making. But depending on the context and need, analysts can also combine open-source and confidential information in the intelligence cycle.
For example, an informer or whistle-blower may give a tip-off that a particular person has undisclosed and suspect business interests. Open-source information from company registries, websites and media articles can be used to try to corroborate this.
Can OSINT support law enforcement?
Yes, in fact it’s essential. In many countries public media reports, especially by investigative journalists, can trigger investigations into suspected financial crimes. In some cases, individuals even give themselves away with information they make publicly available on social media. This happened in the case of the Ambuila family in Colombia, whose daughter posted pictures of herself online with luxury goods and a Lamborghini.
And as my colleague Tom Walugembe describes, the first money laundering case in Uganda started with a widely distributed WhatsApp video showing the criminals posing next to bundles of US dollars in their apartment. Not very intelligent.
Where does OSINT come from?
We tend to associate the term “open source” with digital information, but it can also be physical. Analysts and investigators used to have to go to libraries and government offices to get hold of public records and articles. You’ve seen the films where police leaf through piles of old newspapers in dusty archives?
Now, a lot of this information is available on the internet and analysts spend more time behind their desks. But there’s still a lot of open-source information on paper in the world, especially in countries with less developed IT systems.
Note that freely available doesn’t mean free. Lots of journals and media articles are behind paywalls, for example. Paying for freely available information is a bit of a grey area and frankly a headache for intelligence teams without a lot of resources.
But as long as the cost isn’t so huge that it’s prohibitive and there are no other restrictions on people’s access to it, we still generally classify this data as open-source.
The value of OSINT for both public and private sectors
Open-source intelligence is immensely valuable to law enforcement. It generates leads, corroborates information received from other sources and backs up evidence used in court cases.
It’s also incredibly useful to businesses and financial institutions for conducting risk assessments and due diligence on clients, employees and third parties. Open sources can be used to identify fraud or counterfeit schemes, or to investigate the claims of a whistle-blower. Strategic business intelligence is also usually based on open-source market data.
Is information power? Not any more
The fast expansion of the internet and search tools over the last couple of decades has made masses of information more easily available to the public. A few clicks now take us to newspaper archives going back several years, records of court judgements or downloadable databases of statistics and measurements.
But paradoxically, having fast and easy access to huge amounts of data means that the old saying “information is power” is not so true anymore.
First, there’s the issue of fake news – i.e. deliberately misleading articles or social media postings with no factual basis. Fake news that goes viral adds fuel to the (fake) fire. Analysts have to go through extensive procedures to verify the reliability of open-source information, both when initially collecting it and again during the integration and processing stages.
Second, too much information can easily “infoxicate” us, overloading us to the point where it clouds the decision-making process. Being able to gather relevant data and extract useful patterns is power, not information alone.
How to fight infoxication
Infoxication is a serious issue. If you have spent hours getting lost down rabbit holes on the internet while researching a topic, you’re not alone. Tools that help speed up the search process, extract relevant information and connect the dots are really valuable here.
Our own Basel Open Intelligence tool is designed to do exactly that. It performs one-click automated searches of an individual’s or organisation’s name in combination with over 200 keywords on financial crimes, judicial actions, other criminal offences and custom keyword lists. It also trawls the deep web, picking out any references to the name in sanctions lists and lists of politically exposed persons.
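To make the idea of combining names with keywords concrete, here is a minimal sketch in Python. It is not how Basel Open Intelligence is implemented; the function name `build_queries`, the sample name and the three keywords are all hypothetical, chosen only to show how quickly the number of searches grows when every name is paired with every keyword.

```python
from itertools import product


def build_queries(names, keywords):
    """Pair every name with every keyword as a quoted search string.

    Illustrative only: the quoting style is an assumption, not the
    query syntax of any particular search engine or tool.
    """
    return [f'"{name}" "{keyword}"' for name, keyword in product(names, keywords)]


# Hypothetical inputs for illustration.
names = ["Jane Doe", "Doe Jane"]
keywords = ["money laundering", "bribery", "fraud"]

queries = build_queries(names, keywords)
for query in queries:
    print(query)
```

With two name variants and three keywords this already produces six searches. With a keyword list of 200 or more, doing the same by hand quickly becomes impractical, which is where automation earns its keep.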
Helpfully, you can search for multiple aliases and different name variations at the same time. It’s both amazing and frustrating how many different variations and transliterations of names there can be, especially for languages with different alphabets or complicated name structures. Some people use aliases to disguise their real identities.
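As a rough illustration of why name variations multiply so quickly, the sketch below generates a few simple variants of one name: different word orders, an initial, and a version with accents removed. The helper names `strip_diacritics` and `name_variants` are hypothetical, and the approach is deliberately simplistic. It does not attempt transliteration between alphabets or handle multi-part surnames, which need language-specific rules and human judgement.

```python
import unicodedata


def strip_diacritics(text):
    """Remove accents by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def name_variants(given, family):
    """Return a sorted list of simple spelling and ordering variants."""
    variants = set()
    pairs = [
        (given, family),
        (strip_diacritics(given), strip_diacritics(family)),
    ]
    for g, f in pairs:
        variants.add(f"{g} {f}")      # given name first
        variants.add(f"{f} {g}")      # family name first
        variants.add(f"{f}, {g}")     # registry-style ordering
        variants.add(f"{g[0]}. {f}")  # initial plus family name
    return sorted(variants)


# Hypothetical example name.
for variant in name_variants("José", "Muñoz"):
    print(variant)
```

Even this toy example turns one name into eight search strings, before aliases or alternative transliterations are considered.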
The algorithm uses natural language processing to extract entities and show related persons, companies, locations and professional positions. The documents found in the search are listed along with the extracted main text of the website, excluding irrelevant content like advertisements, menus or cookie notices. Keywords are highlighted for easy skimming. The user can quickly filter and sort the documents according to relevance.
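The highlighting and sorting steps can be pictured with a small sketch. This is an assumption-laden toy, not the tool’s actual algorithm: the functions `highlight`, `keyword_hits` and `rank` are hypothetical, relevance is reduced to a raw count of keyword occurrences, and highlighting simply wraps matches in asterisks.

```python
import re


def highlight(text, keywords):
    """Wrap each keyword occurrence in ** markers, ignoring case."""
    if not keywords:
        return text
    # Longest keywords first so longer phrases win over shorter overlaps.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered), re.IGNORECASE)
    return pattern.sub(lambda m: f"**{m.group(0)}**", text)


def keyword_hits(text, keywords):
    """Count keyword occurrences as a crude stand-in for relevance."""
    lowered = text.lower()
    return sum(lowered.count(k.lower()) for k in keywords)


def rank(documents, keywords):
    """Sort documents from most to fewest keyword hits."""
    return sorted(documents, key=lambda d: keyword_hits(d, keywords), reverse=True)


# Hypothetical documents and keywords.
keywords = ["fraud", "bribery"]
documents = [
    "No relevant mention.",
    "Fraud and bribery charges; fraud again.",
]

for doc in rank(documents, keywords):
    print(highlight(doc, keywords))
```

A real relevance score would weigh much more than keyword counts, such as the source’s reliability and how closely the extracted entities match the search subject. The sketch only shows why highlighted, ranked output is faster to skim than a raw list of links.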
All this saves a huge amount of time and lets users gather just the information they need – and no more.
Open-source information plus automated translation – problem solved?
Basel Open Intelligence is also designed to help overcome other practical challenges faced by investigators, analysts and compliance officers conducting background checks or due diligence using open sources.
Language is a big one. Until recently, the main problem was the language barrier: analysts could not understand information written in languages they don’t speak. Translation takes time and is costly, especially if you don’t know whether the source is relevant or not.
Automatic translation tools like DeepL and Google Translate have made understanding open-source information in other languages faster and cheaper. But now we have a new problem: regular search engines look primarily for results in the user’s own language.
This makes it likely that analysts relying on regular search engines won’t even find potentially crucial information in other languages, let alone have the chance to analyse it.
To help solve this, Basel Open Intelligence conducts searches in multiple languages. It also gives users the option to automatically translate articles into their own language for easier analysis.
A word of caution
When it comes to open-source intelligence, tools are there to help humans and not replace them. Language is a good example: multilingual search and automated translation are hugely valuable but don’t replace human understanding of the context and cultural or political connotations. A word like “paramilitary” or even “corruption” can have very different meanings in the context of Colombia, Ireland or China.
Humans also have a better idea of the information they’re missing, not just the information they can find using open sources. Although many company or beneficial ownership registries are online and accessible, for example, others lack transparency.
Most of all, open-source information is a powerful beast but it needs harnessing. The ability to extract relevant information from the mass of openly available information on the internet and elsewhere, and to turn it into something useful… now that’s power. That’s intelligence.