Blog 8
April 26, 2025
This week, I reached a major milestone for my RAG project: embedding the full patent text and image data from 2020 to 2025. I parsed the gazette files year by year, organized the data into structured CSV files, and embedded the text descriptions into FAISS indexes using Sentence Transformers. Each year now has its own FAISS index and metadata file, which keeps the data modular and searchable.
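A minimal sketch of what the per-year indexing step could look like is below. It is illustrative only and not the exact code from the project: the column name `description`, the output file names, the model `all-MiniLM-L6-v2`, and the choice of a flat inner-product index are all assumptions.

```python
import csv
from pathlib import Path


def load_descriptions(csv_path, text_column="description"):
    """Read one year's CSV and return (metadata rows, texts), skipping empty descriptions.

    The column name "description" is an assumption for this sketch.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = [row for row in csv.DictReader(f) if (row.get(text_column) or "").strip()]
    return rows, [row[text_column] for row in rows]


def index_paths(out_dir, year):
    """Return the (index, metadata) paths for one year. The file names are hypothetical."""
    out_dir = Path(out_dir)
    return out_dir / f"patents_{year}.faiss", out_dir / f"patents_{year}_meta.csv"


def build_year_index(csv_path, out_dir, year, model_name="all-MiniLM-L6-v2"):
    """Embed one year's descriptions and write a FAISS index plus a metadata CSV."""
    # Heavy dependencies are imported here so the helpers above work without them.
    import faiss
    import numpy as np
    from sentence_transformers import SentenceTransformer

    rows, texts = load_descriptions(csv_path)
    if not rows:
        raise ValueError(f"no descriptions found in {csv_path}")

    # The model name is an assumption; any Sentence Transformers model would fit here.
    model = SentenceTransformer(model_name)
    vectors = model.encode(
        texts,
        batch_size=64,
        convert_to_numpy=True,
        normalize_embeddings=True,  # unit vectors, so inner product equals cosine similarity
        show_progress_bar=True,
    )
    vectors = np.asarray(vectors, dtype="float32")  # FAISS expects float32

    index = faiss.IndexFlatIP(vectors.shape[1])  # exact (brute-force) inner-product search
    index.add(vectors)

    index_path, meta_path = index_paths(out_dir, year)
    index_path.parent.mkdir(parents=True, exist_ok=True)
    faiss.write_index(index, str(index_path))

    # Row i of the metadata file corresponds to vector i in the index.
    with open(meta_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()), extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    return index_path, meta_path
```

Calling `build_year_index` once per year, from 2020 through 2025, would give six independent index and metadata pairs. Keeping the metadata rows in the same order as the vectors means a search hit at position i can be mapped back to its patent record with a simple row lookup.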
On the image side, embedding the drawings was initially bottlenecked by the slow CLIP image processor. After resolving dependency issues and installing Torchvision, I switched to the fast CLIP processor, which significantly improved embedding speed. I also batched the images and used my M3 Pro’s MPS (Metal Performance Shaders) hardware acceleration to get through the large volume of images efficiently.
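The sketch below shows one way the batched image embedding could be set up, assuming the Hugging Face `transformers` implementation of CLIP. The checkpoint `openai/clip-vit-base-patch32`, the batch size, and the helper names are assumptions for illustration, not the project's actual settings. It also assumes `get_image_features` returns a plain tensor, as it does in transformers 4.x.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_images(image_paths, model_name="openai/clip-vit-base-patch32", batch_size=32):
    """Return a (num_images, dim) tensor of L2-normalized CLIP image embeddings."""
    # Heavy dependencies are imported here so `batched` can be used on its own.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, CLIPModel

    image_paths = list(image_paths)
    if not image_paths:
        raise ValueError("image_paths is empty")

    # Use Apple's Metal backend when available, otherwise fall back to the CPU.
    device = "mps" if torch.backends.mps.is_available() else "cpu"

    # use_fast=True requests the torchvision-backed image processor,
    # which is why Torchvision has to be installed.
    processor = AutoImageProcessor.from_pretrained(model_name, use_fast=True)
    model = CLIPModel.from_pretrained(model_name).to(device).eval()

    chunks = []
    with torch.no_grad():  # inference only, so skip gradient tracking
        for batch in batched(image_paths, batch_size):
            images = []
            for path in batch:
                with Image.open(path) as img:
                    images.append(img.convert("RGB"))  # drawings are often grayscale
            inputs = processor(images=images, return_tensors="pt").to(device)
            features = model.get_image_features(pixel_values=inputs["pixel_values"])
            features = features / features.norm(dim=-1, keepdim=True)
            chunks.append(features.cpu())  # move off the GPU to keep device memory flat
    return torch.cat(chunks)
```

Batching matters here because each forward pass has a fixed overhead. Sending 32 images at a time to the MPS device amortizes that cost, and moving each batch's result back to the CPU keeps device memory usage roughly constant across a long run.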
In total, I processed and embedded more than 97,000 images, creating a separate set of image embeddings and metadata files. Both text and image embeddings are now complete for all six years and ready to be integrated into the final RAG pipeline. Next week, I plan to work on the retrieval and generation components to demonstrate full end-to-end functionality.