Community Blog

Ushering a New Era of Industry-wide Data Standardization (Announcing the Legend + Delta Lake Integration)

November 09, 2022
In late 2019 when Goldman Sachs announced its intention to contribute the Legend platform to FINOS, the news quickly made the rounds across the financial services industry.

The first powerful hint at the benefits that all industry constituents, including regulators, could derive from this new platform and the open data model standardization it can foster came from the Legend Studio Pilot FINOS ran in the early months of 2020. This pilot brought together data modelers from the sell-side, buy-side, regtech firms and industry associations like ISDA to collaborate on a neutrally hosted instance of Legend Studio and propose community-agreed amendments to the ISDA Common Domain Model, which ISDA approved in record time. Heck, we even won an award for this effort.

When the Alloy platform was fully open sourced, and contributed to FINOS under the new brand Legend, beyond the initial overwhelmingly positive response (here, here and here), there could have been legitimate doubts of actual industry-wide adoption. Especially given the size and the inherent complexity of the platform: think years and years of production-grade, battle-tested software, including its own underlying representation language, Pure.

But that’s where the open source equation came into play - a disruptive differentiator rarely effectively utilized in financial services to date, and honestly, something that even the newer, self-proclaimed tech-savvier industry participants - yes, fintechs I’m looking at you - should consider much more of a core pillar of their go-to-market strategies.

Open Source is a multiplier. As an individual or a corporate, when open source is done right, you aim to get back a directly proportional amount of value with respect to what you put in. If the project aims to solve a real need (data lock-in & interoperability in FSI, anyone?) for a broad set of constituents, and if you give the right amount of love (aka sweat equity), chances are, you’ll see contributors and adopters being attracted to participate.

And that’s where the initial team of maintainers from Goldman Sachs really stepped up:

First off, they took a transparent, open-first approach, meaning development happens directly in the open and software is later consumed from public repositories. Secondly, they committed resources to maintain the platform, with several Legend maintainers in the FINOS-wide top 20 committers leaderboard. Thirdly, they invested in building substantial documentation and educational material for both developer and data modeler contributors. Lastly, they went as far as supporting the FINOS team in hosting a Legend sandbox which is openly accessible for contributors to familiarize themselves with the platform. This sandbox has powered multiple open source data modeling efforts under the Financial Objects Special Interest Group (SIG), co-led by ISDA and Goldman Sachs with broad industry participation.

The importance of this approach cannot be overstated: an openly governed modeling process, based on an openly available modeling platform, on a completely open source stack and representation language. This has effectively given data modelers (and “business folks”) from the likes of Deutsche Bank, Morgan Stanley, RBC, and more the potential to become first class citizens in our open source community. If you truly want to understand Legend’s potential to harmonize data within and across organizations, take a look at Beeke-Marie Nelke, Legend maintainer, and Ffion Ackland, Financial Objects SIG Chair, at the Open Source Strategy Forum 2021.

It is no surprise - though still extremely exciting to community builders like us - that, less than two years later, we see not only substantial adoption, with platform-wide downloads up 330% in Maven Central, 276% for Studio in the last 15 months, and GitHub stars up almost 1000% in the last 2 years, but also a flurry of integrations being built and contributed to the platform. Crucially, many of these do NOT originate from the original contributor, Goldman Sachs. Instead we are seeing cross-pollination and corporate diversity among contributors, which is a critical factor in the health of an open source project. We are seeing contributions from the likes of Canonical, SUSE, Cloudbase, and a healthy number of individual contributors. Heck, there’s even an integration with Morphir, the FINOS project maintained by Morgan Stanley.

Today, we are thrilled to unveil an exciting new contribution to the Legend ecosystem by FINOS Silver Member Databricks, a leading data and analytics platform with deep roots in open source and standards: Legend Delta is a fully open source integration between Legend and Delta Lake, the open source platform for building data lakehouses also hosted under the Linux Foundation.

This has the potential to open a brand new chapter in the history of Legend. With the harmonized view of raw data sources that Legend can provide through its models, users spend less time stitching and integrating, and more time delivering actual value with the data. That ranges from making your regulatory reporting more efficient to truly delivering innovation to your institution by seamlessly enabling the complex type of AI workloads powered by Delta Lake. But don’t take it from me, hear it directly from Antoine Amend, maintainer of the Legend Delta project, during his keynote at OSSF 2021. The integration is not (yet) available in the FINOS Legend hosted instance, but reach out to legend@finos.org if you are interested in seeing it in action.

What started as 6 repositories on GitHub in November 2020 is now a fully-fledged open source data modeling platform, 21 modules strong. What started as proprietary internal software became a potential open source foundation for enterprise data governance in financial institutions. What started as a platform deployed within a single firm is now a cloud-hosted neutral venue welcoming vendors, industry associations and even competing financial institutions to standardize their data front to back, leveraging an open source governance model provided by FINOS to come to consensus. What started as a vision for broad data interoperability is coming to life with Legend standardized open source data models that could be used at all layers of the stack, from the application level via standards like FDC3 to the BI / AI / ML pipelines with Legend Delta.

Open Standards are great, but they are brittle in nature. All it takes is for one participant in the value chain to - willingly or unwittingly - break the standard and the whole workflow fails. It’s not surprising that in an industry with massive interests, a high concentration of influence - and therefore vendor lock-in - and such complex interactions at play (think about a trade lifecycle), data fragmentation is still such a major challenge.

We think FINOS has a vital chance to disrupt that, augmenting open standards by collaborating on openly governed open source data models and SDKs to accelerate developer adoption. All of it on a fully open source stack with growing adoption in the financial industry.

Good news: it seems we are not the only ones who believe in this vision, as three major industry associations, ISDA, ISLA and ICMA, contributed their Common Domain Model to FINOS to be further developed and expanded in this new age of open collaborative data modeling.

Chief Data Officers, you’re welcome :)

 

Read the Goldman Sachs perspective from the Goldman Developer Blog here

Read the Databricks perspective here

 

Interested in FINOS open source projects? Click the link below to see how to get involved in the FINOS Community.

Get Involved

 

FINOS Good First Issues - Looking for a place to contribute? Take a look at good first issues across FINOS projects and get your feet wet in the FINOS community.

State of Open Source in Financial Services Report 2021 - Learn about what is really happening around open source in FSI.

This Week at FINOS Blog - See what is happening at FINOS each week.

FINOS Landscape - See our landscape of FINOS open source and open standard projects.

Community Calendar - Scroll through the calendar to find a meeting to join.

FINOS Slack Channels - The FINOS Slack provides our Community another public channel to discuss work in FINOS and open source in finance more generally.

Project Status Dashboard - See a live snapshot of our community contributors and activity.

Events - Check out our upcoming events or email marketing@finos.org if you'd like to partner with us or have an event idea.

FINOS Virtual "Meetups" Videos & Slides - See replays of our virtual "meetups" based around the FINOS Community and Projects since we can't all be in the same room right now.

FINOS Open Source in Finance Podcasts - Listen and subscribe to the first open source in fintech and banking podcasts for deeper dives on our virtual "meetup" and other topics.