By Louise de Leyritz from CastorDoc
More data, more tools, more people = more data catalogs
Companies are rolling out analytics to more of their employees. Regardless of data literacy, most departments in large companies now use data. As a result, there is a growing need to improve trust in, and understanding of, data resources and infrastructure.
This explains the explosion of data catalogs (internal, open-source, and SaaS) over the past two years. The trend is not going to stop, so we would rather bring visibility and structure to it early.
At Castor, we believe the first step toward structuring the data catalog market is more transparency. For that reason, we put together a list of all the catalog tools we have heard of.
Proposed Dimensions
- Specialization: Data Catalog Only Tools vs Data Catalogs integrated into a larger offering
- Optimized for: Mid-Market vs Enterprise
- Main Use-Case: Compliance vs Analysts Productivity
Actor | Classification | Optimized for | Main Use-Cases | Search | Context | Data quality | Collaboration | Table lineage | Column lineage | Query history | Governance & security | BI tool indexation | Pricing |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Castor | Data Catalog Only | Mid-Market | Analysts Productivity | | | | | | | | | | $$ |
SelectStar | Data Catalog Only | Mid-Market | Analysts Productivity | | | | | | | | | | $$ |
Data Galaxy | Data Catalog Only | Mid-Market | Compliance | | | | | | | | | | contact only |
Secoda | Data Catalog Only | Mid-Market | Analysts Productivity | | | | | | | | | | $ |
Stemma | Data Catalog Only | Mid-Market | Analysts Productivity | | | | | | | | | | $$ |
Tree schema | Data Catalog Only | Mid-Market | Analysts Productivity | | | | | | | | | | $ |
Atlan | Data Catalog Only | Mid-Market | Analysts Productivity | | | | | | | | | | $$$ |
Amundsen | Data Catalog Only | Mid-Market | Analysts Productivity | | | | | | | | | | contact only |
Collibra | Data Catalog Only | Enterprise | Compliance | | | | | | | | | | $$$ |
Alation | Data Catalog Only | Enterprise | Compliance | | | | | | | | | | $$$ |
Dataedo | Data Catalog Only | Enterprise | Compliance | | | | | | | | | | $$ |
Zeenea | Data Catalog Only | Enterprise | Compliance | | | | | | | | | | $$$ |
OvalEdge | Data Catalog Only | Enterprise | Compliance | | | | | | | | | | $$ |
Apache Atlas | Data Catalog Only | Enterprise | Compliance | | | | | | | | | | contact only |
Data.world | Data Catalog Only | Enterprise | Analysts Productivity | | | | | | | | | | $$ |
Metaphor | Data Catalog Only | Enterprise | Analysts Productivity | | | | | | | | | | contact only |
Datahub | Data Catalog Only | Enterprise | Analysts Productivity | | | | | | | | | | contact only |
Datalogz | Data Catalog Only | Mid-Market | Analysts Productivity | | | | | | | | | | $ |
OpenMetadata | Data Catalog Only | | | | | | | | | | | | |
Prequel | Catalog is part of Larger Offering | Mid-Market | Analysts Productivity | | | | | | | | | | contact only |
Monte Carlo | Catalog is part of Larger Offering | Mid-Market | Engineers | | | | | | | | | | contact only |
Datakin | Catalog is part of Larger Offering | Mid-Market | Engineers | | | | | | | | | | contact only |
Transform | Catalog is part of Larger Offering | Mid-Market | Analysts Productivity | | | | | | | | | | contact only |
Google Data Catalog | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | $ |
Immuta | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | contact only |
Tableau data catalog | Catalog is part of Larger Offering | Enterprise | Analysts Productivity | | | | | | | | | | $$ |
IBM data catalog | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | $$$ |
Alteryx | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | contact only |
Aginity | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | contact only |
Cambridge Semantics - Anzo | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | $$ |
Erwin | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | contact only |
Alex solutions | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | contact only |
Cloudera | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | contact only |
Infogix | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | contact only |
Informatica | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | $$ |
Oracle data catalog | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | contact only |
Qlik | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | $$ |
Talend | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | $$ |
Denodo | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | $$$ |
truedat | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | contact only |
SAP data intelligence | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | contact only |
Unifi | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | $$ |
Zaloni | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | contact only |
Ataccama | Catalog is part of Larger Offering | Enterprise | Compliance | | | | | | | | | | contact only |
MyDataCatalogue | | | | | | | | | | | | | |
Decube.io | | | | | | | | | | | | | |
This is a brief attempt at classifying the tools on the market. If anything seems wrong, if the feature list seems off, or if you don't see your data catalog and want it added, please reach out: louise@castordoc.com
F.A.Q
Do You Need a Data Catalog?
If you're having trouble finding data: A data catalog brings together information from different data sources, making it easier for users to search, discover, and access the specific data they require. Without a catalog, users may waste time navigating databases and platforms to locate the datasets they need.
If you're unsure which datasets to use: A data catalog often provides features such as data quality scores, user reviews, and annotations. These help users identify the datasets relevant to their goals, leading to better decisions and analyses.
If you have too many data sources at your disposal: In many organizations, data is scattered across locations such as on-premises databases, cloud storage systems, and third-party platforms. A data catalog consolidates metadata from all these sources into a single view, making it easier for users to explore all the available data and select the most suitable source for their requirements.
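To make the idea of a consolidated view concrete, here is a minimal Python sketch. It is not how any tool in the list above works internally; the `DatasetEntry` record, its fields, and the three sample sources are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One catalog record. All field names are hypothetical."""
    name: str
    source: str              # e.g. "warehouse", "cloud_storage", "third_party"
    description: str
    tags: tuple[str, ...] = ()


def build_catalog(*sources: list[DatasetEntry]) -> list[DatasetEntry]:
    """Merge the metadata lists of several sources into one flat view."""
    return [entry for source in sources for entry in source]


def search(catalog: list[DatasetEntry], term: str) -> list[DatasetEntry]:
    """Case-insensitive match on name, description, or tags."""
    needle = term.lower()
    return [
        e for e in catalog
        if needle in e.name.lower()
        or needle in e.description.lower()
        or any(needle in t.lower() for t in e.tags)
    ]


# Hypothetical metadata, as it might be collected from three separate systems.
warehouse = [DatasetEntry("orders", "warehouse", "One row per customer order", ("sales",))]
cloud = [DatasetEntry("clickstream", "cloud_storage", "Raw web events", ("marketing",))]
third_party = [DatasetEntry("crm_contacts", "third_party", "Customer contacts synced from the CRM", ("sales",))]

catalog = build_catalog(warehouse, cloud, third_party)
print([(e.name, e.source) for e in search(catalog, "customer")])
```

A single search over the merged list returns matches from both the warehouse and the third-party source, which is the point of consolidating metadata rather than querying each system separately.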
If your data environment has never been properly documented: A lack of documentation can lead to chaos and inefficiency. A data catalog not only helps organize your data but also ensures it is documented. It stores information about data lineage, owners, and definitions, giving everyone in the organization an understanding of the origin, purpose, and characteristics of each dataset.
If you need to comply with data regulations such as GDPR or CCPA: It becomes essential to understand where personal data is stored, how it is used, and who has access to it. A data catalog can track this metadata, making it easier for organizations to demonstrate compliance and ensure that sensitive data is handled appropriately.
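As an illustration of the kind of metadata involved, the sketch below answers "where is personal data stored and who can read it?" from column-level records. The `ColumnMetadata` structure, the role names, and the sample columns are assumptions made for this example, not the schema of any specific catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMetadata:
    """Column-level catalog record. All field names are hypothetical."""
    table: str
    column: str
    contains_personal_data: bool
    allowed_roles: frozenset[str]


def personal_data_report(columns: list[ColumnMetadata]) -> dict[str, list[str]]:
    """Map each personal-data column to the sorted roles allowed to read it."""
    return {
        f"{c.table}.{c.column}": sorted(c.allowed_roles)
        for c in columns
        if c.contains_personal_data
    }


# Hypothetical sample metadata.
columns = [
    ColumnMetadata("users", "email", True, frozenset({"support", "data_protection_officer"})),
    ColumnMetadata("users", "signup_date", False, frozenset({"analyst"})),
    ColumnMetadata("orders", "shipping_address", True, frozenset({"logistics"})),
]

report = personal_data_report(columns)
for location, roles in report.items():
    print(location, "->", roles)
```

A report like this is only as reliable as the flags behind it, so in practice the classification of columns as personal data needs an owner and a review process.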