Conrad is one of Germany’s leading B2B and B2C online retailers in the technology sector. To keep up with growing competitive pressure in e-commerce, Conrad decided in 2017 to offer products from other retailers in a newly created B2B marketplace alongside its own product catalogue. However, the envisaged expansion of the product range from 900,000 to 50 million product data records, a more than 50-fold increase, was not feasible with the existing STEP PIM system. The project not only raised the number of product data records by more than an order of magnitude, it also required product information to be updated in near real time.
For an MVP of the new marketplace, Conrad’s own product data from the STEP platform and the marketplace retailers’ product data from the Mirakl system were brought into a uniform data structure. Because data quality differed widely between Conrad’s high-quality, curated product data records and the deliveries from marketplace retailers, the decision was made to implement a customized data integration solution.
The Avantgarde Labs solution
Avantgarde Labs was engaged to create a cloud-based data integration solution that not only consolidates the two source systems, STEP and Mirakl, but also makes product data available in near real time to the web shop and other target systems, such as the digital price tags in Conrad stores. The first step was to connect the two source systems via interfaces and to store the data in MongoDB for scalable retrieval. An intelligent change-data-capture system was then developed that processes complete source-data pools of any size at regular intervals and detects which product data records have changed. Because only changed or new data needs to be processed, the import process is accelerated many times over. As a result, changes in the product data reach customers more quickly and errors in the web shop are corrected more promptly.
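The idea of processing only changed or new records from a full source-data pool can be illustrated with a minimal sketch. It is not the implementation used in the project: it assumes, purely for illustration, that each record is a dictionary keyed by a product ID and that a change is detected by comparing a content hash with the hash stored from the previous run. The function names `fingerprint` and `detect_changes` are hypothetical.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Stable content hash; keys are sorted so field order does not matter."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_changes(previous: dict, snapshot: dict):
    """Compare a full snapshot with the fingerprints kept from the last run.

    previous: product ID -> fingerprint stored after the last import
    snapshot: product ID -> full record from the current source dump
    Returns (new_ids, changed_ids, deleted_ids, current_fingerprints).
    """
    current = {pid: fingerprint(rec) for pid, rec in snapshot.items()}
    new_ids = sorted(pid for pid in current if pid not in previous)
    changed_ids = sorted(
        pid for pid in current
        if pid in previous and current[pid] != previous[pid]
    )
    deleted_ids = sorted(pid for pid in previous if pid not in current)
    return new_ids, changed_ids, deleted_ids, current
```

In such a setup, only the records listed in `new_ids` and `changed_ids` would be passed on to the downstream import, and `current_fingerprints` would be stored for the next run.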
Another implementation challenge was the detection of duplicate products, both within an individual retailer’s data and across different retailers. Although the BMEcat format constrains how product information is structured, each supplier maintains its product data differently, and the uniform data structure created by Mirakl was not sufficient to merge the individual product data records into one golden record. An additional module was therefore developed that identifies related data records on the basis of various characteristics. This reduced the product catalogue from 18 million to 13 million distinct products.
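How records can be grouped on the basis of shared characteristics is sketched below. The characteristics actually used by the module are not described here, so the fields `gtin`, `manufacturer` and `mpn` (manufacturer part number) are assumptions chosen for illustration, as are the function names. The sketch treats two records as the same product if they share a normalized GTIN or a normalized manufacturer and part number, and clusters them with a union-find structure.

```python
import re
from collections import defaultdict


def normalize(value) -> str:
    """Lower-case and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", str(value or "").lower())


def match_keys(record: dict) -> list:
    """Keys under which two records count as the same product (assumed rules)."""
    keys = []
    gtin = normalize(record.get("gtin"))
    if gtin:
        keys.append(("gtin", gtin))
    brand = normalize(record.get("manufacturer"))
    mpn = normalize(record.get("mpn"))
    if brand and mpn:
        keys.append(("mpn", brand, mpn))
    return keys


def group_duplicates(records: list) -> list:
    """Cluster record indices that share at least one match key."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # match key -> index of the first record carrying it
    for idx, record in enumerate(records):
        for key in match_keys(record):
            if key in first_seen:
                parent[find(idx)] = find(first_seen[key])
            else:
                first_seen[key] = idx

    clusters = defaultdict(list)
    for idx in range(len(records)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())
```

Each resulting cluster would be a candidate for one golden record. Exact matching on normalized keys is only the simplest possible rule; real supplier data would likely need fuzzier comparisons and a strategy for resolving conflicting attribute values.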
Project results & customer benefits
The cloud-based integration solution provides customers with up-to-date and correct information at all times. This improves the shopping experience and, because error corrections reach the end-customer systems as quickly as possible, also reduces legal risks. Retailers in the marketplace benefit from the high-performance data provision as well: changes they report to their product data are visible in the web shop within a maximum of 2 hours.
Automatically generated reports based on the deduplication and change-data-capture algorithms give retailers deeper insight into their own data quality, for example where product information is missing. With the solution created by Avantgarde Labs, Conrad and the retailers benefit from an increase in sales, while Conrad’s customers find an expanded range of products and an improved shopping experience in the web shop.