Garment Labels
The Challenge: Transcribing garment labels for product copy is a time-intensive, complicated task, and a bottleneck that limits how quickly e-commerce organizations can get product description pages (PDPs) live.
The Solution: Automated data extraction from a single garment label image, using machine learning (ML) models trained on both standardized and unique label iconography, which delivers the necessary information in moments.
The number of variations in garment labels and care labels is impressive, and much greater than we initially thought.
As we want to make life easier for our customers, we set out to lighten the workload of data entry and ended up learning a lot along the way. Also, we learned to take better care of our own clothes as a side effect.
Icons are one thing (or many)
In the initial steps of this project, we investigated which care icons exist and started training a model to recognize them. But it turned out that a lack of data, combined with a lot of intricacies, made this a less than straightforward process…
A standard is a standard until it is not anymore. For example, there are icons specific to the EU and to the US: same meaning, different icon. Sometimes a temperature is displayed in degrees Celsius, and sometimes it is indicated by dots. Sometimes the same icon, turned 90 degrees, has a new meaning, so the context of each icon matters a great deal.
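To illustrate the dots-versus-degrees point, here is a minimal sketch of how the two notations could be normalized to one value. The function name and the dot table are our assumptions for illustration: the mapping follows commonly published care-symbol charts for the washtub icon and should be checked against the standard you target.

```python
import re
from typing import Optional

# Assumed mapping for the washtub symbol: number of dots -> maximum wash
# temperature in degrees Celsius. Taken from commonly published care-symbol
# charts; verify against the relevant standard before relying on it.
WASH_DOTS_TO_CELSIUS = {1: 30, 2: 40, 3: 50, 4: 60, 5: 70, 6: 95}


def normalize_wash_temperature(inner_text: str = "", dot_count: int = 0) -> Optional[int]:
    """Return the wash temperature in Celsius, or None if neither notation is present.

    `inner_text` is whatever text was read inside the icon (hypothetical input),
    `dot_count` is the number of dots found inside the icon.
    """
    match = re.search(r"\d+", inner_text)
    if match:
        # A printed number wins over dots, since it is the explicit value.
        return int(match.group())
    return WASH_DOTS_TO_CELSIUS.get(dot_count)
```

With this, an EU-style tub showing "40" and a US-style tub with two dots both come out as 40.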
What we ended up with was a three-step solution:
- Find the icons in the image of the label.
- Determine what kind of icon each one is (wash, dry, etc.), based on its outside shape.
- Look at what is inside the icon: now that we know it is a washing icon, for example, can we find numbers or other details?
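The three steps can be sketched as a small pipeline. This is only an outline under our own assumptions, not the production code: `detect`, `classify`, and the per-category readers are hypothetical callables standing in for the trained models, and `CareIcon` is an illustrative container.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height in pixels


@dataclass
class CareIcon:
    box: Box
    category: str                      # e.g. "wash", "dry", "iron"
    details: Dict[str, Any] = field(default_factory=dict)


def read_care_icons(
    image: Any,
    detect: Callable[[Any], List[Box]],
    classify: Callable[[Any, Box], str],
    detail_readers: Dict[str, Callable[[Any, Box], Dict[str, Any]]],
) -> List[CareIcon]:
    """Run the three steps in order for every icon found on the label."""
    icons = []
    for box in detect(image):                      # step 1: find the icons
        category = classify(image, box)            # step 2: outside shape -> kind
        reader = detail_readers.get(category)      # step 3: look inside the icon
        details = reader(image, box) if reader else {}
        icons.append(CareIcon(box, category, details))
    return icons
```

Keeping the steps separate means a new icon family only needs a new detail reader, which is one way to get the flexibility described here.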
Combining these three, we end up with very high precision and a very flexible system! But, it’s not just about the icons…
What about all that text?
When we get to the text parts of the labels, things get even less structured. There are usually a host of languages present, and the data specified in the text can range from just the material composition and country of origin to serial numbers, alternate care instructions, and much more.
After extracting the data from the text, the interesting part starts. We structure everything and figure out whether each piece is new information or just another language telling us something we already know. Sometimes there are even care instructions written in ways that do not map directly onto the standardized icons, which means we have to dive deeper, extract the meaning of the instruction, and compare it to what we learned from reading the icons…
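As a rough illustration of that comparison, the sketch below maps text lines to canonical codes and checks them against what the icons already said. Everything here is an assumption for the example: the tiny phrase table, the code names such as `bleach:none`, and the exact-match lookup, which stands in for the far more flexible meaning extraction a real system needs.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical phrase table: lower-cased label phrases -> canonical care codes.
PHRASE_TO_CODE = {
    "do not bleach": "bleach:none",
    "nicht bleichen": "bleach:none",
    "ne pas blanchir": "bleach:none",
    "do not tumble dry": "tumble_dry:none",
    "nicht im trockner trocknen": "tumble_dry:none",
}


def reconcile(text_lines: Iterable[str], icon_codes: Set[str]) -> Dict[str, List[str]]:
    """Split text instructions into confirmed, new, and unmapped."""
    confirmed: List[str] = []   # text repeats what an icon already told us
    new: List[str] = []         # text adds an instruction no icon covered
    unmapped: List[str] = []    # text we could not interpret; kept verbatim
    seen: Set[str] = set()
    for line in text_lines:
        code = PHRASE_TO_CODE.get(line.strip().lower())
        if code is None:
            unmapped.append(line)
        elif code not in seen:  # same instruction in another language: skip
            seen.add(code)
            (confirmed if code in icon_codes else new).append(code)
    return {"confirmed": confirmed, "new": new, "unmapped": unmapped}
```

Unmapped lines are kept rather than dropped, so nothing from the label is lost even when it cannot be interpreted.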
JSON output from label scan, ready to populate parameters in Creative Force
All this information is packaged up neatly and returned in JSON format, including the original data, so we can make sure our customers have access to everything they need, and we can feed it into our AI Copywriting tool as additional information when writing copy from product images.
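For a feel of what such a payload could look like, here is a small sketch that bundles structured fields together with the original text. The field names and the function are invented for this example and are not the actual Creative Force schema.

```python
import json
from typing import Any, Dict, List


def build_label_payload(
    care_icons: List[Dict[str, Any]],
    composition: Dict[str, int],
    country_of_origin: str,
    raw_text: str,
) -> str:
    """Serialize one label scan to JSON (illustrative field names only)."""
    payload = {
        "care_icons": care_icons,
        "composition": composition,           # material -> percentage
        "country_of_origin": country_of_origin,
        "original": {"text": raw_text},       # keep the untouched source data
    }
    # ensure_ascii=False keeps non-English label text readable in the output.
    return json.dumps(payload, ensure_ascii=False, indent=2)


example = build_label_payload(
    care_icons=[{"category": "wash", "max_celsius": 40}],
    composition={"cotton": 95, "elastane": 5},
    country_of_origin="Portugal",
    raw_text="95% Cotton 5% Elastane\nMade in Portugal",
)
```

Carrying the original text alongside the structured fields lets downstream tools fall back to the source when they need more than the parsed values.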