Re: Dialogflow CX datastore Agent Summarization is...

lijokjohn · 04-09-2024 01:15 PM

Hi Experts,

I have a Search and Conversation app (chatbot) that uses a data store of unstructured PDF documents. 80% of the time when I ask a question, it just gives a vague answer like the one below. Only in certain instances is the answer elaborate and meaningful.

Answer vague and not summarized

Expected answer

What am I doing wrong? How can I make it always give the detailed summarization?

Custom summarization prompt

--------------------------------

Given the conversation between a Human and a AI assistant and a list of sources, write a final answer for the AI assistant.
Follow these guidelines:
+ Answer the Human's query and make sure you mention all relevant details from the sources, using exactly the same words as the sources if possible.
+ The answer must be based only on the sources and not introduce any additional information.
+ All numbers, like price, date, time or phone numbers must appear exactly as
they are in the sources.
+ Give as comprehensive answer as possible given the sources. Include all
important details, and any caveats and conditions that apply.
+ Do not just say refer to link below. Always provide a detailed summary in first response itself.
+ The answer MUST be in English.
+ Don't try to make up an answer: If the answer cannot be found in the sources, you admit that you don't know and you answer NOT_ENOUGH_INFORMATION.

Begin! Let's work this out step by step to be sure we have the right answer.

Sources: $sources
$conversation
Human: $original-query
AI:

--------------------------------

Thanks,

John
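
Note on the prompt above: the placeholders $sources, $conversation and $original-query are filled in by the service before the text reaches the model. The following is a minimal Python sketch of that kind of substitution, only to help reason about what the model ends up seeing. The function name render_prompt and the sample values are hypothetical, and this is not Dialogflow CX's actual implementation.

```python
import re

# Matches placeholders such as $sources or $original-query (letters and
# single hyphens). This pattern is an assumption made for illustration.
PLACEHOLDER = re.compile(r"\$([A-Za-z]+(?:-[A-Za-z]+)*)")


def render_prompt(template: str, values: dict[str, str]) -> str:
    """Replace each $placeholder with its value; leave unknown ones as-is."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return values.get(name, match.group(0))

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    # Tail of the custom prompt from the post, with made-up sample values.
    template = "Sources: $sources\n$conversation\nHuman: $original-query\nAI:"
    print(render_prompt(template, {
        "sources": "[1] (hypothetical snippet text)",
        "conversation": "Human: hello\nAI: Hi, how can I help?",
        "original-query": "What is the refund policy?",
    }))
```

Rendering the prompt this way with the snippets copied from the debug logs makes it possible to replay the exact text against a model outside the agent and compare the answers.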

xavidop

hi!

quick questions:

1. did you play with grounding in the gen ai config?

2. did you test this with the default summarization prompt? what is the output?

Best,

Xavi

lijokjohn

Hi Xavi,

The default prompt behaves the same way. The issue with the default prompt was that the answer was never as detailed as we would like.

Below are the grounding settings.

Thanks,

John

xavidop

Hi,

another recommendation is to choose another LLM, for example, Gemini. Can you please change that configuration in the Gen AI settings and keep me updated?

Best,

Xavi

lijokjohn

Hi Xavi,

I changed to Gemini. It still behaves the same. Is there anything else that can help?

Thanks

John

xavidop

that is really weird. what do the debug logs show? you should see all the generative responses from the data store there.

Best,

Xavi

lijokjohn

Hi Xavi,

I see from the logs that the knowledge connector has all the necessary information/snippets to give a good answer. However it somehow chooses to say "You can check this link and find what you're looking for".

I am attaching the cloud log JSON for your reference. Cloud log JSON

I noticed the following in the JSON:

"name": "Parse ReAct Answer",
"status": {
  "code": "INTERNAL_ERROR"
}

Not sure if this has something to do with the issue.

After a few minutes I asked the chatbot the same question again and it gave me the right answer. Below is the cloud log for that request.

Cloud log for correct answer

Thanks

John
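
Note on the log excerpt above: when comparing a failing log with a successful one, it can help to list every step whose status is not a success. The following is a minimal Python sketch that walks a parsed log entry recursively, so it does not depend on where the steps are nested. Only the "Parse ReAct Answer" / "INTERNAL_ERROR" fragment comes from the post; the surrounding sample structure, the "OK" success code and the function name iter_failed_steps are assumptions for illustration.

```python
import json
from typing import Any, Iterator


def iter_failed_steps(node: Any) -> Iterator[tuple[str, str]]:
    """Yield (step name, status code) for every non-OK step found in node."""
    if isinstance(node, dict):
        status = node.get("status")
        if "name" in node and isinstance(status, dict):
            code = status.get("code")
            # Assumption: a successful step reports the code "OK".
            if code and code != "OK":
                yield str(node["name"]), str(code)
        for value in node.values():
            yield from iter_failed_steps(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_failed_steps(item)


if __name__ == "__main__":
    # Hypothetical log entry; the real export may be nested differently.
    sample = json.loads("""
    {"payload": {"steps": [
        {"name": "Search", "status": {"code": "OK"}},
        {"name": "Parse ReAct Answer", "status": {"code": "INTERNAL_ERROR"}}
    ]}}
    """)
    for name, code in iter_failed_steps(sample):
        print(f"{name}: {code}")
```

Running it on both exported logs and diffing the output shows whether the failing step is the only difference between the two requests.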

xavidop

I am seeing a lot of weird characters in the logs. can you please add an instruction to the prompt to remove all the special characters?

Let's see how that goes!

Best,

Xavi
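
Note on the suggestion above: besides asking the model to ignore special characters, the text can also be cleaned before it is indexed. This only applies if you extract the text from the PDFs yourself and upload the cleaned text, which is an assumption and not what was done in this thread. The following is a minimal Python sketch using only the standard library; the function name clean_text is hypothetical.

```python
import unicodedata


def clean_text(text: str) -> str:
    """Normalize text and drop control/format characters left by extraction."""
    # NFKC folds ligatures and other compatibility forms into plain characters.
    text = unicodedata.normalize("NFKC", text)
    kept = []
    for ch in text:
        if ch in "\n\t":
            kept.append(ch)  # keep ordinary line breaks and tabs
        elif ch == "\ufffd":
            continue  # drop the Unicode replacement character
        elif unicodedata.category(ch).startswith("C"):
            continue  # drop control, format and unassigned characters
        else:
            kept.append(ch)
    return "".join(kept)


if __name__ == "__main__":
    print(clean_text("Fee:\x00 10\ufffd USD\u200b\n"))
```

Legitimate non-ASCII text such as accented letters is kept, because only control-type characters and the replacement character are removed.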

lijokjohn

Hi Xavi,

I added a line in the prompt as below. Is that what you meant?

Given the conversation between a Human and a AI assistant and a list of sources, write a final answer for the AI assistant.
Follow these guidelines:
+ Remove all special characters from the $sources before you analyze it
+ Answer the Human's query and make sure you mention all relevant details from the sources, using exactly the same words as the sources if possible.
+ The answer must be based only on the sources and not introduce any additional information.
+ All numbers, like price, date, time or phone numbers must appear exactly as
they are in the sources.
+ Give as comprehensive answer as possible given the sources. Include all
important details, and any caveats and conditions that apply.
+ Do not just say refer to link below. Always provide a detailed summary in first response itself.
+ The answer MUST be in English.
+ Don't try to make up an answer: If the answer cannot be found in the sources, you admit that you don't know and you answer NOT_ENOUGH_INFORMATION.

Begin! Let's work this out step by step to be sure we have the right answer.

Sources: $sources
$conversation
Human: $original-query
AI:

Thanks

John

xavidop

yes! let's try that

lijokjohn

Hi Xavi,

The issue persists even after the suggested prompt change. On the first try it worked fine but asking the same question again results in the same issue. I noticed in the JSON logs it says "response_reason": "NOT_GROUNDED".

JSON Log (prompt change to remove special characters)

Thanks,

John
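
Note on the intermittent behaviour described above: since the same question sometimes works and sometimes does not, it is easier to judge a configuration change by asking the question several times and counting the outcomes. The following is a minimal Python sketch of such a tally. The ask callable stands in for whatever sends the query to the agent and returns the answer text; it, the hint phrases and the word-count threshold are all hypothetical choices, not part of any Dialogflow API.

```python
from collections import Counter
from itertools import cycle
from typing import Callable

# Phrases that suggest a link-only answer (assumed; adjust to your agent).
LINK_ONLY_HINTS = ("check this link", "refer to link")


def classify(answer: str, min_words: int = 25) -> str:
    """Label an answer 'vague' if it is link-only or very short."""
    lowered = answer.lower()
    if any(hint in lowered for hint in LINK_ONLY_HINTS):
        return "vague"
    if len(answer.split()) < min_words:
        return "vague"
    return "detailed"


def measure(ask: Callable[[str], str], query: str, runs: int = 10) -> Counter:
    """Ask the same query `runs` times and count each kind of answer."""
    return Counter(classify(ask(query)) for _ in range(runs))


if __name__ == "__main__":
    # Fake agent for demonstration: one detailed answer, then two vague ones.
    canned = cycle([
        " ".join(["detail"] * 30),
        "You can check this link and find what you're looking for",
        "You can check this link and find what you're looking for",
    ])
    print(measure(lambda query: next(canned), "What is the policy?", runs=6))
```

Running the same tally before and after each change (model, grounding level, prompt) gives a comparable number instead of a single impression from one try.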

xavidop

I am sorry, I do not know what is happening...

can you show me your Dialogflow CX design? are you transitioning to another page or flow?

lijokjohn

Hi Xavi,

No transitions. For the purpose of our testing, I created a simple app with just the start page.

xavidop

Is the Data store indexed?

which parser configuration did you use when you were uploading the docs? did you enable chunks?

that will help a lot

Best,

Xavi

lijokjohn

Hi Xavi,

My data store is indexed from a set of PDFs in Cloud Storage.

I used the Digital Parser.

I tried both, with chunking and without chunking. The issue exists in both cases.

The chatbot was functioning properly about a month ago. At that time the data store builder didn't have options such as chunking. Recently we had to recreate the data store, and we started noticing the issues after that.

The new agent-based bots could have been an alternative to try. But we can't use them, because agent-based chatbots do not provide the URIs of the documents from which they construct the answers to user queries.

Thanks,

John

xavidop

oh wow! that is not good.

I recommend opening an issue on the issue tracker: https://cloud.google.com/support/docs/issue-trackers

After doing that, please share the link with me so I can track it!

Xavi

lijokjohn

Hi Xavi,

I will do that.

Meanwhile, as a workaround, I was thinking of using a generator for the data store fulfillment in Dialogflow.

Do you know if we can somehow pass $sources in a generator prompt? I tried the below, but it does not recognize $sources in generator prompts.

Thanks,

John

xavidop

yes you can:

$sys.func.GET_FIELD($sys.func.GET($request.knowledge.sources[0], 0), "uri")

$sys.func.GET_FIELD($sys.func.GET($request.knowledge.sources[0], 0), "title")

jon_17s

Hi John,

Did you already open the ticket and get the resolution from Google Cloud Support? If yes, could you please share the resolution?

Best Regards,
Jonatan

piyush_garg

Hey @lijokjohn ,

This happens quite often lately: the first time we ask a question it gives the answer, but if we ask the same question a second time in the same session, it does not answer and returns a fallback instead.

We tried one approach and it is working for us. Please change the configuration as follows and then try again:

Change the model from Gemini to text-bison@002.

Change grounding to medium.

Best

Piyush Garg

Dialogflow CX datastore Agent Summarization issue