How I used RAG, AWS Bedrock, dltHub and Pinecone to Explore UK Property Data

Rakesh Gupta
2 min read · Dec 30, 2024

I recently started exploring the idea of building an AI agent focused on UK government property datasets. This stemmed from an evening gathering with a few friends, where we casually discussed UK property prices, property types, and housing shortages.

Creating Power BI visuals from these datasets was straightforward for me, but I wanted to experiment with something even simpler: a tool that wouldn’t require me to create or update visuals manually for each new question.
Foundation models could already answer many general queries, but I wanted a solution tailored specifically to UK property pricing and property statistics, addressing the kinds of questions my friends and I often had.

This led me to explore the RAG (Retrieval-Augmented Generation) architecture and to build a knowledge base from the last 10 years of data, so that these specific datasets could be queried directly.
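To make the idea concrete, here is a minimal sketch of the retrieve-then-prompt flow behind RAG. It is an illustration under assumptions, not the app's actual code: the `embed` function is a toy word-count stand-in for a real embedding model, the in-memory list stands in for a vector store such as Pinecone, and the document snippets are hypothetical placeholders rather than real statistics.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge-base snippets (placeholders, not real statistics).
DOCUMENTS = [
    "Average price of detached houses in an example region rose during 2023.",
    "Flats and maisonettes made up the largest share of sales in an example city.",
    "Housing completions fell short of the stated target in an example year.",
]


def embed(text):
    """Toy embedding: lowercase word counts. A real app would call an embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, top_k=2):
    """Return the top_k documents most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(query_vector, embed(d)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context):
    """Assemble the prompt that would be sent to a foundation model."""
    joined = "\n".join(f"- {chunk}" for chunk in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    question = "What happened to detached houses prices?"
    print(build_prompt(question, retrieve(question, DOCUMENTS, top_k=1)))
```

The generation step is left out on purpose: the assembled prompt is what would be passed to whichever foundation model the app uses.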

I experimented with Amazon Web Services (AWS) Bedrock, Streamlit, DuckDB, dltHub, and Pinecone (the free tier was sufficient for my needs). With just a few code adjustments, I was able to generate visuals (primarily bar charts) and provide explanations based on user queries. It was fascinating to see the results!
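As a rough illustration of the "query result to bar chart" step, the sketch below groups a handful of rows by property type and renders a plain-text bar chart. Everything here is an assumption for demonstration: the sample rows are made up, the field names `property_type` and `price` are hypothetical, and the standard library stands in for the DuckDB query and the Streamlit chart that an app like this would more likely use.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the values are illustrative, not real figures.
SALES = [
    {"property_type": "Detached", "price": 400_000},
    {"property_type": "Detached", "price": 500_000},
    {"property_type": "Flat", "price": 200_000},
    {"property_type": "Terraced", "price": 250_000},
    {"property_type": "Flat", "price": 300_000},
]


def average_price_by_type(rows):
    """Group rows by property type and return the mean price per type."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["property_type"]].append(row["price"])
    return {ptype: mean(prices) for ptype, prices in sorted(grouped.items())}


def text_bar_chart(series, width=40):
    """Render a label -> value mapping as a text bar chart scaled to `width`."""
    peak = max(series.values())
    lines = []
    for label, value in series.items():
        bar = "#" * round(value / peak * width)
        lines.append(f"{label:<10} {bar} {value:,.0f}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(text_bar_chart(average_price_by_type(SALES)))
```

In a Streamlit app, the dictionary returned by `average_price_by_type` could be handed to a chart call such as `st.bar_chart` instead of being printed as text.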

Key Learnings:

(1) Context-Aware Data is Essential: Simply feeding raw data into your GenAI app won’t make it efficient. You need to work with contextually relevant data, which means understanding the entire data engineering process: ingestion, transformation, data quality checks, and observability. These steps are crucial to building genuinely useful RAG-based GenAI applications.

(2) Semantic Data is Key: Transformed data, or what I call “semantic data” (data aligned with the context of a specific business problem), is critical for success.

(3) Understanding Models and Techniques: A solid understanding of embedding models, tokenization, and the capabilities of various foundation models is essential.
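The first two learnings can be sketched in a few lines: validate a raw record, then turn it into a short, context-bearing sentence with metadata that is ready for embedding. This is a sketch under stated assumptions, not the pipeline used in the app: the row layout, the field names, and the property-type code mapping are hypothetical, and the whitespace word count is only a crude proxy for what a real tokenizer would report.

```python
# Hypothetical raw record; field names and values are illustrative only.
RAW_ROW = {"price": "250000", "date": "2023-06-15", "ptype": "F", "town": "EXAMPLETOWN"}

# Assumed code-to-label mapping, for illustration.
PROPERTY_TYPES = {
    "D": "detached house",
    "S": "semi-detached house",
    "T": "terraced house",
    "F": "flat or maisonette",
}


def is_valid(row):
    """Basic quality gate: required fields present, positive price, known type code."""
    required = ("price", "date", "ptype", "town")
    if any(not row.get(key) for key in required):
        return False
    if not row["price"].isdigit() or int(row["price"]) <= 0:
        return False
    return row["ptype"] in PROPERTY_TYPES


def to_chunk(row):
    """Turn a validated raw row into a readable sentence plus filterable metadata."""
    price = int(row["price"])
    year = int(row["date"][:4])
    label = PROPERTY_TYPES[row["ptype"]]
    town = row["town"].title()
    text = f"In {year}, a {label} in {town} sold for £{price:,}."
    metadata = {"year": year, "property_type": label, "town": town, "price": price}
    return {"text": text, "metadata": metadata}


def rough_token_count(text):
    """Crude size estimate by whitespace splitting; real tokenizers will differ."""
    return len(text.split())


if __name__ == "__main__":
    if is_valid(RAW_ROW):
        chunk = to_chunk(RAW_ROW)
        print(chunk["text"], rough_token_count(chunk["text"]))
```

Keeping the metadata alongside the text means a vector store query can filter by year or property type before similarity ranking, which is one way to keep retrieved context relevant.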

This also means that generating data visuals is becoming automated, and I am optimistic that GenAI will make developing complex data visuals much easier in the near future.

End-to-end RAG Q&A app architecture using AWS Bedrock
Q&A and auto-generated visuals
Context-aware Q&A

Written by Rakesh Gupta

Founder and IT Consultant, SketchMyView (www.sketchmyview.com). Reach me here: linkedin.com/in/grakeshk
