
AI Futurist Technical Specification

Read more about the technical details of how AI Futurist operates, including its data processing and storage mechanisms and security and privacy protocols.

AI Futurist components explained

AI Futurist has two main components: the retrieval engine and the generation engine. The retrieval engine searches the platform data and retrieves relevant and diverse content based on the query and the context. The generation engine produces new text based on the data provided by the retrieval engine. The two engines work together to produce the final output, interleaving the generated and retrieved content while ensuring the consistency and coherence of the text.

The retrieval engine

The retrieval engine is the component that searches and retrieves content from the platform data based on the user-provided query and the context. It identifies suitable text snippets across different phenomena. The retriever uses a combination of semantic and syntactic methods, including embedding, indexing, and ranking, to retrieve content that is both relevant and diverse and that matches the query and the context.
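
The ranking step of such a retriever can be sketched as follows. The snippet texts, the 3-dimensional vectors, and the function names are illustrative toy values, not the actual engine or its embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def rank_snippets(query_vec, snippets, top_k=2):
    """Return the top_k snippet texts most similar to the query embedding.

    snippets: list of (text, embedding) pairs drawn from the content index.
    """
    scored = sorted(snippets, key=lambda s: cosine(query_vec, s[1]), reverse=True)
    return [text for text, _ in scored[:top_k]]

# Toy 3-dimensional embeddings, for illustration only.
index = [
    ("Urban mobility shifts to autonomous fleets", [0.9, 0.1, 0.0]),
    ("Quantum sensing matures in healthcare",      [0.1, 0.9, 0.1]),
    ("Self-driving logistics reshape cities",      [0.8, 0.2, 0.1]),
]
print(rank_snippets([1.0, 0.0, 0.0], index))
```

A production retriever would also blend in syntactic (keyword) scores and diversity criteria; the cosine ranking above shows only the semantic half of that combination.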

The generation engine

The retrieved data then serves as a foundation for the generation engine. This engine leverages a Large Language Model (LLM), trained on a large, general-purpose corpus of texts, to create new text that seamlessly complements the retrieved information. It may synthesise existing insights, propose novel connections, or even generate entirely new scenario elements based on the user prompt.
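
A common way to hand the retrieved data to an LLM is to inline the snippets into the prompt. The sketch below shows one plausible prompt layout; AI Futurist's actual template is not public, so the wording here is an assumption:

```python
def build_prompt(query, snippets):
    """Assemble a grounded prompt from retrieved foresight snippets.

    Hypothetical example of RAG prompt construction, not the actual
    prompt used by AI Futurist.
    """
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Using the foresight content below, answer the question.\n"
        "If the content is insufficient, rely on general knowledge and say so.\n\n"
        f"Content:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )

prompt = build_prompt(
    "How might cities change by 2040?",
    ["Urban mobility shifts to autonomous fleets"],
)
print(prompt)
```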

By combining these elements, AI Futurist offers a unique blend of expert-curated data and AI-powered generation, ensuring users receive insights grounded in both knowledge and innovative analysis.

It’s worth noting that the retrieval engine only has access to data on the platform, i.e. Futures Platform content and user-generated content. In most cases, AI Futurist's answer is therefore well grounded in the curated foresight data. However, when a user queries topics where the available content is scarce or missing, the generated text may be partly or fully based on the LLM (i.e. based on the data the LLM was trained on).
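
This fallback behaviour can be expressed as a simple decision rule: when no retrieved snippet clears a relevance threshold, the answer comes from the LLM's own training data. The rule and the 0.5 threshold below are illustrative assumptions, not the service's internal logic:

```python
def grounding_mode(scores, threshold=0.5):
    """Decide how an answer is grounded, given snippet relevance scores.

    Hypothetical rule: the real cut-off and logic are internal to the
    service; this only illustrates the behaviour described in the text.
    """
    relevant = [s for s in scores if s >= threshold]
    if not relevant:
        return "llm_only"            # no usable platform content
    if len(relevant) < len(scores):
        return "partially_grounded"  # mix of platform content and LLM knowledge
    return "fully_grounded"

print(grounding_mode([0.91, 0.84]))  # plenty of relevant content
print(grounding_mode([0.12, 0.08]))  # scarce content
```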

Data processing and management

Responsible data handling is at the core of AI Futurist. Our data processing pipeline follows strict protocols to ensure the accuracy, security, and ethical use of information:

  • Data sourcing: We use only data on the platform as source material, guaranteeing reliable and trustworthy information. The data consists of the standard Futures Platform phenomena available to all users, as well as users’ own content (in English) that they have added to the platform.
  • Data storage: All data is securely stored within our standard data infrastructure and tools on the Google Cloud Platform, benefiting from industry-leading service quality and security measures. All content is stored in ElasticSearch.
  • Data embedding: In retrieval-augmented generation (RAG), the content is transformed into numerical vectors that capture the semantic and syntactic similarity of the texts. We use language models available and hosted in Vertex AI to generate the embeddings.
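
In production the embeddings come from models hosted in Vertex AI; as a self-contained stand-in, the toy embedder below folds character trigram counts into a fixed-size vector, just to show the shape of the transformation (text in, numerical vector out). It is illustrative only and captures far less meaning than a real embedding model:

```python
import hashlib

def toy_embed(text, dim=16):
    """Map text to a fixed-size numerical vector.

    Stand-in for a real embedding model: folds character trigram counts
    into `dim` buckets, so texts sharing trigrams land in similar
    buckets. Illustrative only.
    """
    vec = [0.0] * dim
    lowered = text.lower()
    for i in range(len(lowered) - 2):
        gram = lowered[i:i + 3].encode("utf-8")
        bucket = int(hashlib.md5(gram).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec

print(toy_embed("climate change"))
```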

Security and privacy

AI Futurist is committed to ensuring the security and privacy of your data. We use the following measures to protect your data and texts from unauthorised access, use, or disclosure:

  • We use Google Cloud Platform (GCP) as our cloud service provider, which offers a high level of security and compliance, including encryption, authentication, authorisation, and auditing.
  • We use Vertex AI as our machine learning platform, a GCP service that allows us to build, deploy, and manage our machine learning models, such as the retrieval and generation engines, in a secure and scalable way.
  • We use ElasticSearch on GCP as our database, which allows us to store and manage all raw content and its embeddings in a secure and scalable way.
  • We use a separate index for customers’ own content, isolated from the general platform data, so that users can utilise their (and only their) own data with AI Futurist.
  • In addition to the industry-leading generative AI security measures provided by GCP and Vertex AI, we apply several additional security controls at each stage of the RAG process.
  • We do not store or share the generated texts unless users explicitly consent, for example, for feedback or improvement purposes.
  • We do not store users’ data or texts for any purpose other than providing the AI Futurist service.
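
The per-customer index isolation can be pictured as a routing rule: a query from a given organisation may only touch the shared platform index and that organisation's own index. The index names and the routing function below are hypothetical, chosen only to illustrate the principle:

```python
# Hypothetical name for the shared index of standard platform phenomena.
PLATFORM_INDEX = "platform_phenomena"

def searchable_indices(org_id):
    """Return the only indices a query from this organisation may search.

    Illustrative routing rule: the shared platform index plus the
    caller's own customer index, never another customer's index.
    """
    return [PLATFORM_INDEX, f"customer_{org_id}"]

print(searchable_indices("acme"))
```

Because the customer index name is derived from the caller's own organisation, a query can never be routed to another customer's content.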