Solved: Choosing the right tools for my startup

diegodamaceno · 02-23-2024 08:01 AM

Hello. I'm new to this community. 🙂

I'm in the ideation phase of a B2B product, but I have questions about which Google Cloud tools I would need to use to build my app.

My app: my intention is to build an application that can scan product labels and identify whether they follow the guidelines of regulatory bodies (e.g. FDA). This involves identifying texts and their sizes, identifying stamps and logos. The intention is that I can train my evaluation system with the regulatory body's rules, so that it can read the label and return insights to the user.

Well, that's basically the intention. While researching, I saw that Google's OCR is a good place to start, but there are a few different solutions, such as Document AI and Cloud Vision, and I wasn't sure which of these would be right for my use case.

I would be very grateful if any community members could share their experience and guide me to the correct tool.

I hope I can do the same for the community in the near future. 🙂

Roderick

Hi @diegodamaceno,

Welcome to the Community! Excited to hear that you're interested in learning more about the Google Cloud ecosystem, and looking forward to seeing you get engaged across the forums. GCC is a community of communities. You'll notice that we have dedicated forum spaces for all kinds of topics, including Data Analytics, AI/ML, no-code app development and Google Cloud Computing.

Take a look around and check out some of the conversations in progress and add your voice. As a startup, you're in the right place! There are also some great programs to look into over at startup.google.com when you get a chance. I am not an expert, but I can definitely explore some internal resources with you and help point you in the direction of the best Google Cloud tools to suit your project needs!

Here's a breakdown of the options and why they could be relevant:

Core Services

Cloud Vision API: This is a great starting point. It provides powerful image analysis features, including:

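Optical Character Recognition (OCR): Excellent for extracting text from the product labels. It does not report font size directly, but it returns a bounding box for each piece of text, from which text height can be estimated.
Logo Detection and Object Localization: Identifies popular logos and general objects. Stamps or seals specific to a regulatory body may need a custom-trained model.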
Explicit Content Detection: Helps with flagging inappropriate content if that's a factor in certain regulations.
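
As a rough illustration of the text-size point: OCR output typically gives a bounding polygon per word or block, so text height has to be derived from the polygon's corners. Below is a minimal sketch, assuming the four vertices arrive as (x, y) pixel pairs ordered top-left, top-right, bottom-right, bottom-left relative to the text, and that the scan resolution (dpi) is known. The function names are hypothetical, and the sketch does not call the Vision API itself.

```python
from math import hypot


def text_height_px(vertices):
    """Estimate text height in pixels from a 4-point bounding polygon.

    Assumption: vertices are (x, y) pairs ordered top-left, top-right,
    bottom-right, bottom-left relative to the text direction.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = vertices
    left_edge = hypot(x3 - x0, y3 - y0)    # top-left to bottom-left
    right_edge = hypot(x2 - x1, y2 - y1)   # top-right to bottom-right
    # Average both edges so slightly skewed boxes are handled sensibly.
    return (left_edge + right_edge) / 2


def px_to_mm(pixels, dpi):
    """Convert a pixel length to millimetres for a known scan resolution."""
    return pixels / dpi * 25.4


# Hypothetical bounding box for one word on a label scanned at 300 dpi.
box = [(10, 20), (110, 20), (110, 50), (10, 50)]
height_px = text_height_px(box)
height_mm = px_to_mm(height_px, dpi=300)
```

Physical size only makes sense if the image resolution or a reference object of known size is available, so a photo taken with a phone would need some form of calibration first.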

Document AI: If your product labels contain structured or semi-structured information (like tables or forms), Document AI is a more specialized tool. It's designed to:

Parse complex documents: Extract specific data fields as needed by the regulatory guidelines.
Custom parsers: You can train specialized parsers to focus on the particular elements required by your targeted regulatory bodies.

How to Decide:

To choose the most suitable one, consider these questions:

Label Format: Are the labels simple images or more complex documents with tables and structured data?
Specificity of Requirements: Do you need raw text extraction, or do you require extraction of specific data points in a structured format?

Complementary Services

Cloud Storage: You'll need storage for image uploads to feed into Cloud Vision or Document AI.
Cloud Natural Language API: After text extraction, it can analyze and categorize the text, which helps when checking it against the rules and guidelines you define.
BigQuery: For storing and analyzing large amounts of data if you plan to scale your application.

Workflow Suggestion

1. Upload Image: User uploads an image of a product label to your application.
2. Storage: Store the image in Cloud Storage.
3. Preprocessing: Consider image preprocessing (e.g., cropping, resizing) if needed, potentially using Cloud Functions.
4. Core Analysis: Use Cloud Vision API to extract text along with its bounding boxes (from which text size can be estimated) and to identify visual elements. Optionally, use Document AI if you need structured data extraction.
5. Insight Generation: Define logical rules based on the specific regulatory guidelines using your preferred programming language. Compare the analysis output from step 4 against these rules.
6. Results: Present insights to the user (approved, needs changes, etc.).
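
The Insight Generation and Results steps can be sketched as plain rule checks over the extracted elements. This is a minimal sketch, assuming the earlier analysis has already been reduced to a list of text snippets with an estimated height in millimetres. The `TextElement` and `Rule` types, the phrases, and the minimum heights are all hypothetical placeholders, not actual FDA requirements.

```python
from dataclasses import dataclass


@dataclass
class TextElement:
    text: str          # text extracted from the label
    height_mm: float   # estimated text height


@dataclass
class Rule:
    required_phrase: str   # phrase that must appear on the label
    min_height_mm: float   # placeholder minimum text height


def check_label(elements, rules):
    """Return a list of findings; an empty list means every rule passed."""
    findings = []
    for rule in rules:
        phrase = rule.required_phrase.lower()
        matches = [e for e in elements if phrase in e.text.lower()]
        if not matches:
            findings.append(f"missing: {rule.required_phrase}")
        elif max(e.height_mm for e in matches) < rule.min_height_mm:
            findings.append(f"too small: {rule.required_phrase}")
    return findings


# Made-up label content and made-up thresholds, for illustration only.
elements = [TextElement("Nutrition Facts", 3.0), TextElement("Net Wt 12 oz", 1.0)]
rules = [Rule("Nutrition Facts", 2.0), Rule("Net Wt", 1.6), Rule("Ingredients", 1.0)]

findings = check_label(elements, rules)
status = "approved" if not findings else "needs changes"
```

Keeping the rules as data rather than hard-coded conditions makes it easier to maintain a separate rule set per regulatory body.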

Note: Building a robust application to accurately interpret regulatory guidelines will likely involve iterative model training and fine-tuning.

Let me know if you'd like to dive deeper into any of these aspects or discuss some of the training strategies for your evaluation system!

Also, if you haven't already, take a look at the Google Cloud Innovators Program. Some of the benefits include access to exclusive learning resources and other members-only opportunities to help you grow as a Cloud professional. Let me know if you need help navigating that!

Be sure to check out our Learning & Certification Hub, where members share best practices on preparing for certifications, stay up to date on what’s next with Google Cloud, and network with others who share similar goals.

Keep us posted on your progress!
