[Issue]: <Entity Extraction Question> #729

Closed
1 of 2 tasks
Bai1026 opened this issue Jul 26, 2024 · 3 comments
Labels
autoresolved, awaiting_response, stale

Comments

Bai1026 commented Jul 26, 2024

Is there an existing issue for this?

  • I have searched the existing issues
  • I have checked #657 to validate if my issue is covered by community support

Describe the issue

Nothing is actually wrong, but while testing GraphRAG on my own chat log dataset, I found that it could successfully answer the questions I asked (e.g., list some of the words Vincent uses most in the chat log).

I then checked the graphml and parquet files, but I could not find these words as entities. They are specific words (something like "damn" or "oops", but in Chinese).

How is it possible for local search to answer the question successfully if these entities were never extracted?

Steps to reproduce

No response

GraphRAG Config Used

# Paste your config here

Logs and screenshots

No response

Additional Information

  • GraphRAG Version:
  • Operating System:
  • Python Version:
  • Related Issues:
Bai1026 added the triage label Jul 26, 2024
natoverse (Collaborator) commented

In the response, the LLM should print citations for the entities and relationships used (e.g., "Entities(id1,id2,...)"). You can look up those entities in the create_final_entities.parquet and match on the human readable id field.

So: can you check your outputs and see if this aligns? If you have an entity id cited that is not in the parquet, this may be a hallucination example.
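
A minimal sketch of that check, in case it helps. It assumes the response contains citations shaped like `Entities (3, 7, 42)` and that you have already collected the known IDs, for example by reading `create_final_entities.parquet` with pandas and taking the human readable id column (the exact column name and citation format are assumptions here, so adjust them to your output). The function names are hypothetical, not part of GraphRAG.

```python
import re


def cited_entity_ids(response: str) -> set[int]:
    """Collect every integer ID that appears inside an Entities(...) citation."""
    ids: set[int] = set()
    # Assumed citation shape: "Entities (1, 2, 3)" or "Entities(1,2,3)".
    for group in re.findall(r"Entities\s*\(([^)]*)\)", response):
        # Keep only numeric tokens; anything else in the group is ignored.
        ids.update(int(token) for token in re.findall(r"\d+", group))
    return ids


def missing_entity_ids(response: str, known_ids) -> set[int]:
    """Return cited IDs that are absent from the known entity IDs."""
    return cited_entity_ids(response) - set(known_ids)


if __name__ == "__main__":
    # Hypothetical response text and ID set, for illustration only.
    response = "Vincent often says this [Data: Entities (3, 7, 42); Relationships (5)]."
    known_ids = {1, 3, 7}
    print(sorted(missing_entity_ids(response, known_ids)))
```

Any ID this reports as missing is cited in the answer but not present in the entity table, which would point toward a hallucinated citation.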

Otherwise: We try with our prompts to force the LLM to only rely on the supplied entity list to answer the question, but it is possible that it is drawing on its training to answer the question.

You also mention that the language is Chinese - we are tracking non-English with a consolidated issue; folks have various comments on how to get better results: #696

natoverse added the awaiting_response label and removed the triage label Jul 26, 2024

github-actions bot commented Aug 3, 2024

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

github-actions bot added the stale label Aug 3, 2024

This issue has been closed after being marked as stale for five days. Please reopen if needed.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Aug 17, 2024