Introduction

Red Reaper is an espionage AI that autonomously uncovers key espionage data, from wire transfers and blackmail opportunities to sensitive mergers and confidential negotiations, showcasing autonomous capability in nefarious intelligence collection. The Red Reaper proof-of-concept (PoC) serves as a powerful example of the AI-driven threats sophisticated adversaries are already harnessing or interested in developing.

The security community, and likely many beyond it, is now aware of the Chinese I-Soon leaks. The leaks are notable for revealing how Chinese offensive systems use advanced data science to categorize stolen documents and to intricately map the network connections of the documents' creators. This represents a significant shift in how adversaries are embracing AI, intensifying the already asymmetric balance between cyber offense and defense. It allows adversaries to process and exploit stolen data at unprecedented speed. For defenders, it means rapidly identifying and mitigating breaches, staying ahead of automated attacks that evolve faster than traditional response strategies, and devising defenses that hold up against AI-driven threats.

The I-Soon disclosure of such advanced espionage AI systems inspired us to develop a comparable AI, taking creative liberties, with the aim of demystifying its workings for the security community and raising awareness of this evolving cyber threat. This first installment in the series is a blog post introducing "Red Reaper," an AI espionage agent, and provides an in-depth exploration of the tool's technical specifics and capabilities.

System Components
Data Science Layer 1: Enhancing Named Entity Recognition with Custom Embeddings

In the initial phase of the project, we sought to leverage the spaCy Named Entity Recognition (NER) model, specifically the "en_core_web_lg" variant, with an emphasis on identifying entities categorized as "LAW." We hypothesized that focusing on legal terminology could surface sensitive discussions or attachments, potentially pivotal in identifying espionage-grade material. A notable challenge with the spaCy NER model, however, is that it does not provide confidence scores for its entity classifications. This gap makes it difficult to ascertain the relevance of detected entities and introduces noise into the analysis.

To work around this limitation, we delineated the "LAW" category using a carefully selected list of terms that represent the category as we would ideally like to see it represented. We then calculated the mean of the embeddings generated from this list to create a composite "law semantic feature space embedding." This embedding encapsulates the core semantic attributes of legal terminology that we felt could help identify sensitive business communications. By comparing this mean embedding against the embeddings of sentences containing entities recognized by the spaCy model, using cosine similarity, we established a confidence scoring mechanism. This technique not only allowed us to filter out non-relevant matches more efficiently but also played a critical role in subsequent data science layers, enabling further detection of information with potential espionage implications.

get_entities() Function

The code below defines a function, get_entities, that uses spaCy's "en_core_web_lg" model to perform Named Entity Recognition (NER) on a set of documents (emails), with the aim of identifying entities in categories of interest, notably those relevant to espionage. It iterates through each document, cleans the text to remove encoding errors, and lowercases a copy for keyword exclusion checks. Documents containing any keyword from a predefined list of non-relevant terms (used to exclude shared news articles) are skipped to improve processing efficiency. For the remaining documents, the function applies the NER model to identify entities ("LAW") matching the categories specified in "filter_list." For each detected entity, it calculates a confidence score based on cosine similarity between the embedding of the entity's surrounding sentence and the predefined "law semantic feature space embedding," determining how relevant each entity is to the categories of interest. Relevant entities are then documented, including their labels, text, confidence scores, and the sentences they appear in. Finally, if any entities are found, the details are appended to a JSONL file for further analysis. This approach allows targeted extraction of potentially sensitive information from large volumes of text, leveraging NER and custom embeddings to refine the accuracy and relevance of the findings.
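The original get_entities code is not reproduced in this post, so the snippet below is a minimal sketch of the flow described above. The exclude_keywords values, the email record fields (doc_id, text, sender, recipients), and the output path are illustrative assumptions rather than the original implementation; get_confidence() is the helper defined in the next section.

```python
# Minimal sketch of the get_entities() flow described above -- not the original Red Reaper
# code. exclude_keywords, the email record fields, and the output path are assumptions;
# get_confidence() is the helper defined in the next section.
import json
import spacy

nlp = spacy.load("en_core_web_lg")

exclude_keywords = ["unsubscribe", "newsletter", "press release"]  # assumed news/noise terms
filter_list = ["LAW"]                                              # entity labels of interest

def get_entities(emails, out_path="entities.jsonl"):
    for email in emails:
        # Strip characters that commonly cause encoding errors, and lowercase a copy
        # for the keyword exclusion check.
        text = email["text"].encode("ascii", errors="ignore").decode()
        if any(kw in text.lower() for kw in exclude_keywords):
            continue  # skip shared news articles and similar non-relevant content

        hits = []
        for ent in nlp(text).ents:
            if ent.label_ not in filter_list:
                continue
            sentence = ent.sent.text
            confidence = get_confidence(sentence)  # cosine similarity vs. the law centroid
            hits.append({"label": ent.label_, "entity": ent.text,
                         "confidence": confidence, "sentence": sentence})

        if hits:
            record = {"doc_id": email["doc_id"], "sender": email["sender"],
                      "recipients": email["recipients"], "entities": hits}
            with open(out_path, "a") as fh:
                fh.write(json.dumps(record) + "\n")
```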
get_confidence() Function

The code below processes defined legal terminology through the lens of vector space models, specifically word embeddings. It first defines a list named "LAW_Tokens" containing keywords associated with legal concepts such as contracts, compliance, intellectual property, and corporate governance. It then loads a pre-trained word embedding model from a file, storing each word's vector in a dictionary for efficient access. This model, sourced from fasttext.cc, provides vector representations for over two million words, capturing semantic and syntactic nuances.

The core functionality revolves around two processes: vector retrieval and centroid calculation. The "get_word_vector" function retrieves a word's embedding from the dictionary; if the word is not found, it returns a zero vector, ensuring compatibility with the overall vector space model. The "get_embedding_centroid" function computes the centroid of the embeddings for a given list of words, effectively finding the average vector that represents the semantic "center" of those words in the multi-dimensional space defined by the embeddings. This is particularly useful for capturing the essence of a complex concept, such as identifying espionage-grade communications, through its constituent terms.

Finally, the code calculates the centroid for the law-related tokens and defines a "get_confidence" function that computes the cosine similarity between the centroid of an entity's sentence text and the law centroid. This similarity measure serves as a confidence score, quantifying how closely the semantics of the sentence align with the concept of law as defined by the "LAW_Tokens."
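As with the previous snippet, the original code is not reproduced here, so the following is a minimal sketch of the described mechanism. Only a handful of LAW_Tokens are shown, and the vector file name assumes one of the English fastText .vec releases from fasttext.cc (e.g. crawl-300d-2M.vec).

```python
# Minimal sketch of the centroid/confidence mechanism described above -- not the original
# code. The LAW_Tokens shown are a small illustrative subset, and the vector file name is
# an assumption (an English fastText .vec file from fasttext.cc).
import numpy as np

LAW_Tokens = ["contract", "compliance", "intellectual", "property", "governance",
              "settlement", "litigation", "regulatory", "merger", "acquisition"]

def load_vectors(path, dim=300):
    """Load a fastText .vec file into a {word: vector} dictionary (memory heavy)."""
    vectors = {}
    with open(path, encoding="utf-8") as fh:
        next(fh)  # first line is "<word count> <dimensions>"
        for line in fh:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:dim + 1], dtype=np.float32)
    return vectors

embeddings = load_vectors("crawl-300d-2M.vec")

def get_word_vector(word, dim=300):
    # Unknown words map to a zero vector so they remain compatible with the vector space.
    return embeddings.get(word.lower(), np.zeros(dim, dtype=np.float32))

def get_embedding_centroid(words):
    # The mean of the word vectors: the semantic "center" of the listed terms.
    return np.mean([get_word_vector(w) for w in words], axis=0)

law_centroid = get_embedding_centroid(LAW_Tokens)

def get_confidence(sentence):
    """Cosine similarity between the sentence's centroid and the law centroid."""
    sentence_centroid = get_embedding_centroid(sentence.split())
    denom = np.linalg.norm(sentence_centroid) * np.linalg.norm(law_centroid)
    return float(np.dot(sentence_centroid, law_centroid) / denom) if denom else 0.0
```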
Data Science Layer 2: Graph Analysis

In the second layer of our data science pipeline, the initial filtering narrowed an overwhelming 517,402 emails down to a more manageable 20,622. At this stage, we leverage graph data science techniques to illuminate the web of communications between the individuals involved in creating and sharing sensitive business information. Our primary objective is to construct a comprehensive communication graph from the data curated in the previous layer. Beyond assembly, we aim to identify sub-communities that are particularly active in generating and circulating sensitive content, and to pinpoint key figures within those networks: individuals who act as influencers, sharers, or creators of critical information. Singling out these central figures lets us direct analytical effort more precisely, ensuring we scrutinize the most relevant interactions for potential insights.

To achieve these goals, we employ a graph database (Neo4j) coupled with its Graph Data Science (GDS) library. This toolkit enables advanced algorithms such as Louvain for community detection and degree centrality metrics to help identify the most influential participants within the communication landscape, setting the stage for the final layer, which applies generative AI to extract nuanced insights from the communication patterns we uncover. The resulting database schema emphasizes the relationships between documents (specifically emails), the entities involved in these communications, and the high-value sentences extracted from the documents.

build_graph_from_jsonl() Function

The code below defines a function, build_graph_from_jsonl, that constructs a graph database from the processed email communications in the JSONL file. The function iterates through each item in the file and, for each email document, identifies sentences with a confidence score above a specified threshold (0.44), treating these as high-value content. For documents containing such sentences, the code creates nodes in the graph database for the document itself, for the sender's email address (linked with a "SENT_FROM" relationship), and for each recipient email address (linked with a "SENT_TO" relationship). It then creates nodes for each high-confidence sentence and establishes a "CONTAINS" relationship between the document and those sentences, linking documents to their critical content. After building these basic relationships, the code projects the dataset into memory to enable further GDS operations. Community detection is applied to uncover communication groupings within the network, the PageRank algorithm is run to identify influential nodes (highlighting key documents or participants), and degree centrality is calculated to further gauge node importance by connection count. Finally, the in-memory projection used for the analysis is dropped, concluding the structured analysis process.
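The original build_graph_from_jsonl code is likewise not reproduced, so the snippet below is a minimal sketch of the described flow using the Neo4j Python driver and GDS procedures. The connection settings, node labels, Cypher statements, and in-memory graph name ("comm") are assumptions; only the 0.44 threshold and the SENT_FROM / SENT_TO / CONTAINS relationships come from the description above.

```python
# Minimal sketch of the build_graph_from_jsonl() flow described above -- not the original
# code. Connection settings, node labels, Cypher statements, and the projection name are
# assumptions; the 0.44 threshold and relationship types come from the post.
import json
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
THRESHOLD = 0.44

def build_graph_from_jsonl(jsonl_path):
    with driver.session() as session, open(jsonl_path) as fh:
        for line in fh:
            item = json.loads(line)
            high_value = [e for e in item["entities"] if e["confidence"] > THRESHOLD]
            if not high_value:
                continue
            # Document, sender, and recipient nodes with SENT_FROM / SENT_TO relationships.
            session.run(
                "MERGE (d:Document {id: $doc_id}) "
                "MERGE (s:Entity {email: $sender}) "
                "MERGE (d)-[:SENT_FROM]->(s) "
                "WITH d UNWIND $recipients AS r "
                "MERGE (t:Entity {email: r}) "
                "MERGE (d)-[:SENT_TO]->(t)",
                doc_id=item["doc_id"], sender=item["sender"], recipients=item["recipients"])
            # High-confidence sentences linked to their document via CONTAINS.
            for ent in high_value:
                session.run(
                    "MATCH (d:Document {id: $doc_id}) "
                    "MERGE (x:Sentence {text: $text}) SET x.confidence = $conf "
                    "MERGE (d)-[:CONTAINS]->(x)",
                    doc_id=item["doc_id"], text=ent["sentence"], conf=ent["confidence"])

        # Project the graph in memory, run community detection and centrality metrics,
        # then drop the projection once the write-backs are complete.
        session.run("CALL gds.graph.project('comm', ['Document', 'Entity', 'Sentence'], "
                    "['SENT_FROM', 'SENT_TO', 'CONTAINS'])")
        session.run("CALL gds.louvain.write('comm', {writeProperty: 'community'})")
        session.run("CALL gds.pageRank.write('comm', {writeProperty: 'pagerank'})")
        session.run("CALL gds.degree.write('comm', {writeProperty: 'degree'})")
        session.run("CALL gds.graph.drop('comm')")
```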
Deciphering Key Communication Clusters in Emerging Communities

At this stage, our analysis unveils fascinating insights into the communication network, revealing patterns that are intuitive to a human at a glance but analytically challenging for computational methods. By integrating community detection with degree centrality metrics, we enable the system to identify and quantify communication clusters that might otherwise be overlooked. Take, for instance, community 1031, highlighted in our analysis below. A manual examination of the graph visually exposes areas of concentrated communication; our algorithms, however, provide a quantitative understanding of these clusters, attributing numerical values to concepts like connectivity (degree), influence (PageRank), and group membership (community labels). This quantification is pivotal, as it equips the system to replicate human-like discernment when sifting through vast amounts of data.

Delving deeper into community 1031, our algorithms proved effective at pinpointing essential communication nodes. The individuals with the highest degree centrality within this cluster, Louise Kitchen (President, Enron Online), Sally Beck (Chief Operating Officer), and John Lavorato (CEO, Enron America), are not only key figures within the organization but are also logically situated within the same communication cluster. This correlation exemplifies the algorithms' capability to identify significant patterns within the data.

Data Science Layer 3: Generative AI

In the final layer of our data science framework, we leverage generative AI to synthesize and further refine the analysis of the espionage-relevant material identified in the previous layers. The process begins by evaluating entities and sentences within the context of their communication communities, using the graph structure to select top entities and their associated high-value sentences based on community influence and content significance. The system then employs a generative AI model to assess each sentence's potential value for espionage, blackmail, extortion, or other criminal activities. This assessment is not just raw data extraction; it involves a nuanced understanding of the content's context and of how it could be misused in the hands of state-sponsored adversaries or criminal entities. The AI model considers each sentence in isolation, imagining a scenario in which the information could be weaponized. It categorizes sentences into levels of potential value and suggests the type of crime they could facilitate, alongside a targeted analysis of each piece of information. This process not only identifies the most sensitive and operationally valuable information but also supports the creation of targeted strategies for leveraging it. We intentionally stopped short here, as we felt a walkthrough of advanced operationalization techniques would be irresponsible. By analyzing communication patterns and content at a granular level, the AI pinpoints specific pieces of data that stand out for their potential misuse. This strategic analysis is operationalized by creating "Analysis" nodes within the database, each tagged with a value rating and detailed insights into why and how the information could be exploited.
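The code for this layer is not shown in the post, and the post does not name the model or prompt it used. The sketch below illustrates the described flow under stated assumptions: an OpenAI-compatible chat API, an illustrative prompt, assumed Cypher queries, and an assumed ASSESSED_BY relationship linking Analysis nodes back to sentences.

```python
# Minimal sketch of the generative-AI layer described above -- not the original code. The
# model name, prompt, Cypher queries, and ASSESSED_BY relationship are all assumptions.
from neo4j import GraphDatabase
from openai import OpenAI

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = ("You are assessing a single sentence taken from a corporate email. Rate its "
          "potential value (low / medium / high) for espionage, blackmail, or extortion, "
          "name the type of crime it could facilitate, and briefly explain why.")

def analyze_top_sentences(limit=50):
    with driver.session() as session:
        # Pull high-value sentences from the most influential documents in each community.
        rows = list(session.run(
            "MATCH (d:Document)-[:CONTAINS]->(s:Sentence) "
            "RETURN id(s) AS sid, s.text AS text, d.community AS community "
            "ORDER BY d.pagerank DESC LIMIT $limit", limit=limit))
        for row in rows:
            response = client.chat.completions.create(
                model="gpt-4",  # assumed model name
                messages=[{"role": "system", "content": PROMPT},
                          {"role": "user", "content": row["text"]}])
            assessment = response.choices[0].message.content
            # Persist the assessment as an Analysis node linked back to the sentence.
            session.run(
                "MATCH (s:Sentence) WHERE id(s) = $sid "
                "CREATE (a:Analysis {summary: $summary, community: $community}) "
                "CREATE (s)-[:ASSESSED_BY]->(a)",
                sid=row["sid"], summary=assessment, community=row["community"])
```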
Simple Flask API & UI: Unveiling Insights

The Flask API and UI are only touched on briefly here; they do not contribute significantly to the outcomes of this project, so their technical details are not included. The analysis began with over half a million (simulated) stolen emails and led to the identification of 385 emails deemed operationally critical from an espionage standpoint. These included potential leads on wire transfers, blackmail schemes, sensitive merger discussions, and confidential negotiations.

The Red Reaper project emerges as a compelling demonstration of the advanced capabilities AI can offer for intelligence gathering with nefarious intent. The proof-of-concept (PoC) not only highlights the sophistication of potential AI-driven threats but also raises alarms about current adversary capabilities and interests. Built on insights gleaned from leaked adversary capabilities, Red Reaper was assembled in just a few weeks by a single engineer, underscoring the profound implications and the scale of threat that could be realized if such technologies were deployed at the level of nation-states.

A particularly noteworthy discovery made by the Red Reaper project was an email chain hinting at a potential extortion opportunity: discussions among Enron executives about the possibility that they might have breached the Foreign Corrupt Practices Act. Such information, if exploited by adversaries, could serve as a significant leverage point for extortion.

UI: Red Reaper Insights