The CDM has the potential to become a crucial standard in the financial sector, enabling consistent and interoperable data models across institutions. However, its complexity and extensive documentation make it challenging for newcomers and even seasoned professionals to fully grasp and implement.
Recognizing this challenge, the community identified the application of LLM paradigms to CDM as a valuable use case. Over the summer, a collaborative effort involving TradeHeader, Provectus, ISDA, and research assistants led to the development of prototypes showcasing how LLMs can process and interpret CDM documentation and code.
Showcasing the Chatbot Prototype
One of the key prototypes presented by Provectus was a chatbot designed to assist users in understanding CDM. By priming the chatbot with extensive data from CDM documentation and source code, the Provectus team enabled it to answer complex technical questions, generate JSON examples of financial instruments, and provide detailed explanations about CDM workflows.
- Minimizing Hallucinations: The team applied retrieval-augmented generation (RAG), combined with careful prompt engineering and tuning of the underlying model’s parameters, to minimize AI hallucinations and ensure accurate, reliable responses (a minimal sketch of this pattern follows the list).
- Utilizing Source Code: By integrating the actual CDM source code, the chatbot could provide precise answers and even suggest improvements or modifications to the CDM.
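For readers unfamiliar with the pattern, the sketch below shows the general shape of a RAG pipeline under illustrative assumptions: it is not the Provectus implementation, and the documentation chunks, keyword-overlap retriever, and model name are placeholders.

```python
# Minimal RAG sketch (illustrative only): retrieve relevant CDM documentation
# chunks with a naive keyword-overlap score, inject them into the prompt, and
# call the model at a low temperature to limit hallucinations.
from openai import OpenAI

# Placeholder documentation snippets; a real setup would index the full CDM docs and source.
CDM_CHUNKS = [
    "TradeState captures the state of a trade through its lifecycle.",
    "BusinessEvent qualifies lifecycle events such as execution or termination.",
    "PriceQuantity links price and quantity observations to a product.",
]

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by shared words with the question (a stand-in for embedding search)."""
    words = set(question.lower().split())
    ranked = sorted(chunks, key=lambda c: len(words & set(c.lower().split())), reverse=True)
    return ranked[:k]

def answer(question: str) -> str:
    context = "\n".join(retrieve(question, CDM_CHUNKS))
    prompt = (
        "Answer using only the CDM context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    client = OpenAI()  # requires OPENAI_API_KEY in the environment
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,        # low temperature helps keep answers grounded in the context
    )
    return response.choices[0].message.content

print(answer("What does a BusinessEvent represent in the CDM?"))
```

In a production setup the keyword scorer would typically be replaced by an embedding index over the full CDM documentation and source tree.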
Parsing Legal Documents with AI
Another significant application presented by ISDA was using LLMs to parse and standardize legal documents, such as Credit Support Annexes (CSAs). CSAs are complex and vary widely in language and structure, making them difficult to digitize and integrate into models like CDM.
- Multi-Agent Architecture: The team developed a multi-agent architecture in which specialized agents each focus on extracting specific clauses from the document and standardizing them into a CDM representation (see the sketch after this list).
- Automated Extraction: This approach aims to automate the extraction and standardization process without extensive fine-tuning, making it scalable.
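As an illustration of how such a clause-per-agent design can be orchestrated, the sketch below fans a CSA document out to specialized agents and merges their outputs; the clause names, prompts, and call_llm hook are hypothetical and do not reflect the ISDA team’s actual implementation.

```python
# Illustrative multi-agent sketch: one agent per CSA clause, each with its own
# extraction instructions, and a coordinator that merges the results into a
# CDM-like dictionary of standardized fields.
from dataclasses import dataclass

def call_llm(prompt: str) -> str:
    """Placeholder for a real model call (e.g. a hosted or local LLM client)."""
    return "<extracted clause text>"

@dataclass
class ClauseAgent:
    clause: str        # standardized field name, e.g. "threshold"
    instructions: str  # clause-specific extraction instructions

    def extract(self, csa_text: str) -> str:
        prompt = f"{self.instructions}\n\nCSA document:\n{csa_text}"
        return call_llm(prompt)

def parse_csa(csa_text: str, agents: list[ClauseAgent]) -> dict[str, str]:
    """Fan the document out to each specialized agent and collect standardized fields."""
    return {agent.clause: agent.extract(csa_text) for agent in agents}

agents = [
    ClauseAgent("threshold", "Extract the threshold amount per party."),
    ClauseAgent("eligible_collateral", "List the eligible collateral and haircuts."),
    ClauseAgent("minimum_transfer_amount", "Extract the minimum transfer amount."),
]
print(parse_csa("<full CSA text here>", agents))
```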
Improving Accessibility and Understanding
TradeHeader presented a summarizer tool that transforms complex CDM JSON representations into formats that are easier for business analysts to understand. The generated report provides a comprehensive transaction description using the following components:
- CDM Native Functionalities: Qualification modules designed to extract information from the CDM event and product structures, used as additional context for the custom LLM prompts.
- Custom Models for Specific Tasks: Task-specific LLMs were used to handle different aspects of the CDM (see the sketch after this list):
  - Trade model: Resolves the extraction of trade- and event-specific data such as involved parties, trade IDs, business events, or dates relative to trade lifecycle processes.
  - Product model: Extracts product-specific features of the model, such as party roles, underlying instruments, leg structures, or settlement dates.
  - Additional information model: Generates a summary enhancement section that includes extra product features, giving the summarizer the ability to capture less common CDM structures such as stubs, event-specific flags, or underlying instrument details.
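To make the flow concrete, here is a minimal sketch of how such a pipeline could be wired together: the qualification output becomes shared context, and each task-specific prompt produces one section of the report. The prompt texts, the summarise() hook, and the sample event are hypothetical, not TradeHeader’s actual prompts.

```python
# Sketch of the summarizer flow: CDM qualification output is used as shared
# context, three task-specific prompts produce the trade, product, and
# additional-information sections, and the sections form the final report.
import json

def summarise(section: str, instructions: str, context: str) -> str:
    """Placeholder for a task-specific LLM call returning one report section."""
    return f"[{section}] ..."

def build_report(cdm_event_json: dict, qualification: str) -> str:
    context = f"Qualification: {qualification}\nEvent: {json.dumps(cdm_event_json)}"
    sections = {
        "Trade": "Describe parties, trade identifiers, business events and lifecycle dates.",
        "Product": "Describe party roles, underliers, leg structures and settlement dates.",
        "Additional information": "Describe less common structures such as stubs or event flags.",
    }
    return "\n\n".join(summarise(name, instr, context) for name, instr in sections.items())

example_event = {"tradeIdentifier": ["TRADE-2024-001"], "eventDate": "2024-06-28"}
print(build_report(example_event, "InterestRate_IRSwap_FixedFloat"))
```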
By using LLMs, TradeHeader bridged the gap between technical and non-technical stakeholders, enhancing collaboration and efficiency. In the future, the goal is to develop a foundational model capable of fully understanding and interpreting the CDM, reducing reliance on hard-coded prompts, expanding product coverage, and exploring other open-source AI solutions.
FinGPT for Regulatory Interpretation and Compliance
Chief compliance officers have long envisioned a system capable of interpreting regulations and automatically generating compliance reports for active trades. At the workshop, researchers from Columbia University and Rensselaer Polytechnic Institute (RPI) introduced an agent built with FinGPT that aims to solve this problem.
Although still in the early stages, the two main envisioned outcomes are i) a Validator to ensure lineage consistency between regulatory requirements and reporting output, and ii) a Rule Generator/Copilot to create new ISDA Digital Regulatory Reporting (DRR) rules based on new regulatory requirements.
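To illustrate what the Validator concept might look like in its simplest form, the toy sketch below checks that each field required by a rule is present in the reporting output and records the regulatory source of each check; the rule format and field names are hypothetical, and the FinGPT-based agent itself is still in early development.

```python
# Toy lineage-consistency check: for each field a rule requires, verify it is
# populated in the reporting output and keep a trace back to the regulatory text.
def validate_lineage(rule_fields: dict[str, str], report: dict[str, object]) -> list[str]:
    """Return one finding per required field, noting the regulatory source text."""
    findings = []
    for field, source_text in rule_fields.items():
        status = "OK" if report.get(field) not in (None, "") else "MISSING"
        findings.append(f"{status}: '{field}' (required by: {source_text})")
    return findings

# Hypothetical EMIR-like rule fields and a partial reporting output.
emir_like_rule = {
    "reportingTimestamp": "timely reporting requirement",
    "counterparty1": "counterparty identification requirement",
}
report = {"reportingTimestamp": "2024-06-28T10:00:00Z"}
print("\n".join(validate_lineage(emir_like_rule, report)))
```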
This work has the potential to enhance the DRR system by reducing maintenance costs and improving scalability across multiple jurisdictions, such as Japan and Australia. We welcome market participants eager to contribute with funding and/or capacity. The agent source code will be made available in the FINOS Labs EMIR-specific RAG repository, with models and data in the newly created Hugging Face repository.
The innovative applications showcased at the workshop underscore the transformative potential of Large Language Models in simplifying and enhancing the Common Domain Model. By bridging the gap between complex technical frameworks and user comprehension, these initiatives are making the CDM more accessible, fostering greater interoperability, and driving efficiency across the financial industry.
For those interested in exploring this initiative further, you can read more in the LLM Exploration Project GitHub repository. We also invite you to join our weekly calls held every Thursday, where we discuss ongoing developments and opportunities for collaboration. You can find the meeting details and add the event to your calendar here.
Stay tuned for upcoming developments and opportunities to get involved in this groundbreaking initiative. Your participation and expertise are invaluable as we work together to shape the future of finance through innovative AI solutions.
Authors: Luca Borella, Karl Moll