I’m encountering an issue when trying to retrieve the top K base nodes with the RecursiveRetriever
from LlamaIndex. Instead of retrieving the top K base nodes, it retrieves the top K nodes overall (base and reference nodes mixed), resolves the references, and then deduplicates, so the final result can contain fewer than K unique base nodes.
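To make the behavior concrete, here is a toy model of what seems to happen (plain Python, no LlamaIndex; the ids and hit order are made up for illustration):

```python
# Each retrieved hit is (node_id, base_id_it_resolves_to). Reference nodes
# resolve to their parent base node; base nodes resolve to themselves.
hits = [
    ("ref-a", "node-0"),   # summary generated from node-0
    ("node-0", "node-0"),  # the base chunk itself
    ("ref-b", "node-0"),   # question generated from node-0
    ("node-3", "node-3"),  # another base chunk
]

# What the retriever appears to do with similarity_top_k=4: take the top 4
# hits, resolve references to base nodes, then deduplicate.
resolved = []
for _, base_id in hits[:4]:
    if base_id not in resolved:
        resolved.append(base_id)

print(resolved)  # ['node-0', 'node-3'] -- only 2 unique base nodes, not 4
```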
I’ve followed the notebook example but modified it slightly for my use case. Here is a simplified code snippet:
import copy
import os
import nest_asyncio
nest_asyncio.apply()
from llama_index.core import Document, VectorStoreIndex, Settings
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.schema import IndexNode, TextNode
from llama_index.core.extractors import (
SummaryExtractor,
QuestionsAnsweredExtractor,
)
from llama_index.llms.openai import OpenAI
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.core.retrievers import RecursiveRetriever
from llama_index.core.embeddings import resolve_embed_model
def main():
    embed_model = resolve_embed_model("local:BAAI/bge-small-en")
    # Without this, VectorStoreIndex falls back to the default (OpenAI) embedding
    Settings.embed_model = embed_model

    introduction_llama2 = """Introduction Large Language Models (LLMs) have shown great promise as
highly capable AI assistants that excel in complex reasoning tasks requiring expert knowledge
across a wide range of fields, including in specialized domains such as programming and
creative writing. They enable interaction with humans through intuitive chat interfaces,
which has led to rapid and widespread adoption among the general public.
The capabilities of LLMs are remarkable considering the seemingly straightforward nature of the
training methodology. Auto-regressive transformers are pretrained on an extensive corpus of
self-supervised data, followed by alignment with human preferences via techniques such as
Reinforcement Learning with Human Feedback (RLHF). Although the training methodology is simple,
high computational requirements have limited the development of LLMs to a few players. There
have been public releases of pretrained LLMs (such as BLOOM (Scao et al., 2022), LLaMa-1
(Touvron et al., 2023), and Falcon (Penedo et al., 2023)) that match the performance of closed
pretrained competitors like GPT-3 (Brown et al., 2020) and Chinchilla (Hoffmann et al., 2022)
but none of these models are suitable substitutes for closed “product” LLMs, such as ChatGPT,
BARD, and Claude. These closed product LLMs are heavily fine-tuned to align with human
preferences, which greatly enhances their usability and safety. This step can require
significant costs in compute and human annotation, and is often not transparent or easily
reproducible, limiting progress within the community to advance AI alignment research.
In this work, we develop and release Llama 2, a family of pretrained and fine-tuned LLMs,
Llama 2 and Llama 2-Chat, at scales up to 70B parameters. On the series of helpfulness and
safety benchmarks we tested, Llama 2-Chat models generally perform better than existing open
source models. They also appear to be on par with some of the closed-source models, at least on
the human evaluations we performed (see Figures 1 and 3). We have taken measures to increase
the safety of these models, using safety-specific data annotation and tuning, as well as
conducting red-teaming and employing iterative evaluations. Additionally, this paper
contributes a thorough description of our fine-tuning methodology and approach to improving
LLM safety. We hope that this openness will enable the community to reproduce fine-tuned LLMs
and continue to improve the safety of those models, paving the way for more responsible
development of LLMs. We also share novel observations we made during the development of Llama 2
and Llama 2-Chat, such as the emergence of tool usage and temporal organization of knowledge.
Figure 3: Safety human evaluation results for Llama 2-Chat compared to other open-source and
closed-source models. Human raters judged model generations for safety violations across ~2,000
adversarial prompts consisting of both single and multi-turn prompts. More details can be found
in Section 4.4. It is important to caveat these safety results with the inherent bias of LLM
evaluations due to limitations of the prompt set, subjectivity of the review guidelines, and
subjectivity of individual raters. Additionally, these safety evaluations are performed using
content standards that are likely to be biased towards the Llama 2-Chat models.
We are releasing the following models to the general public for research and commercial use‡:
1. Llama 2, an updated version of Llama 1, trained on a new mix of publicly available data. We
also increased the size of the pretraining corpus by 40%, doubled the context length of the
model, and adopted grouped-query attention (Ainslie et al., 2023). We are releasing variants of
Llama 2 with 7B, 13B, and 70B parameters. We have also trained 34B variants, which we report on
in this paper but are not releasing.§2. Llama 2-Chat, a fine-tuned version of Llama 2 that is
optimized for dialogue use cases. We release variants of this model with 7B, 13B, and 70B
parameters as well.
We believe that the open release of LLMs, when done safely, will be a net benefit to society.
Like all LLMs, Llama 2 is a new technology that carries potential risks with use (Bender et
al., 2021b; Weidinger et al., 2021;
Solaiman et al., 2023). Testing conducted to date has been in English and has not — and could
not — cover all scenarios. Therefore, before deploying any applications of Llama 2-Chat,
developers should perform safety testing and tuning tailored to their specific applications of
the model. We provide a responsible use guide and code examples to facilitate the safe
deployment of Llama 2 and Llama 2-Chat. More details of our responsible release strategy can be
found in Section 5.3.
The remainder of this paper describes our pretraining methodology (Section 2), fine-tuning
methodology (Section 3), approach to model safety (Section 4), key observations and insights
(Section 5), relevant related work (Section 6), and conclusions (Section 7).
‡https://ai.meta.com/resources/models-and-libraries/llama/§We are delaying the release of the
34B model due to a lack of time to sufficiently red team.
https://ai.meta.com/llama‖https://github.com/facebookresearch/llama
Figure 4: Training of Llama 2-Chat: This process begins with the pretraining of Llama 2 using
publicly available online sources. Following this, we create an initial version of Llama 2-Chat
through the application of supervised fine-tuning. Subsequently, the model is iteratively
refined using Reinforcement Learning with Human Feedback (RLHF) methodologies, specifically
through rejection sampling and Proximal Policy Optimization (PPO). Throughout the RLHF stage,
the accumulation of iterative reward modeling data in parallel with model enhancements is
crucial to ensure the reward models remain within distribution."""
    # Make sure to set your OpenAI API key for the extractor LLM calls
    # os.environ["OPENAI_API_KEY"] = "your-openai-api-key"
    docs = [Document(text=introduction_llama2)]

    node_parser = SentenceSplitter(chunk_size=256)
    base_nodes = node_parser.get_nodes_from_documents(docs)
    for idx, node in enumerate(base_nodes):
        node.id_ = f"node-{idx}"

    extractors = [
        SummaryExtractor(summaries=["self"], show_progress=True),
        QuestionsAnsweredExtractor(questions=2, show_progress=True),
    ]

    all_nodes = copy.deepcopy(base_nodes)
    node_to_metadata = {}
    for extractor in extractors:
        # extract() takes a list of nodes and returns one metadata dict per node
        metadata_dicts = extractor.extract(base_nodes)
        for node, metadata in zip(base_nodes, metadata_dicts):
            if node.node_id not in node_to_metadata:
                node_to_metadata[node.node_id] = metadata
            else:
                node_to_metadata[node.node_id].update(metadata)

    # Build IndexNode reference nodes from the extracted metadata, each
    # pointing back at its base node (this is the step that mixes base
    # and reference nodes in the same index)
    for node_id, metadata in node_to_metadata.items():
        for val in metadata.values():
            all_nodes.append(IndexNode(text=val, index_id=node_id))

    vector_index_metadata = VectorStoreIndex(all_nodes)
    vector_retriever_metadata = vector_index_metadata.as_retriever(
        similarity_top_k=4
    )

    # Dictionary of all nodes for reference resolution
    all_nodes_dict = {n.id_: n for n in all_nodes}

    # Initialize RecursiveRetriever
    retriever_metadata = RecursiveRetriever(
        "vector",
        retriever_dict={"vector": vector_retriever_metadata},
        node_dict=all_nodes_dict,
        verbose=False,
    )

    nodes = retriever_metadata.retrieve(
        "What is the purpose of this paper?"
    )
    return nodes
if __name__ == "__main__":
    main()
Questions
- How can I retrieve the top K (unique) base nodes effectively?
- Should all nodes be IndexNode, or should the base nodes be TextNode and only the reference nodes IndexNode?
Expected vs. Actual Behavior
Expected Behavior: Retrieve the top K unique base nodes (K = similarity_top_k).
Actual Behavior: Returns at most K unique base nodes, usually fewer, because the top K slots are shared with reference nodes that resolve to duplicate base nodes.
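The only workaround I have found so far is to over-retrieve and post-filter, sketched below in plain Python (the oversampling factor is an arbitrary guess on my part, not something from the library). Is there a built-in way to do this?

```python
def first_k_unique_base(ranked_base_ids, k):
    """Keep the first k distinct base-node ids, preserving rank order.

    ranked_base_ids: base-node ids of the retrieved hits, best score first,
    after resolving reference nodes to their parent base node.
    """
    out = []
    for bid in ranked_base_ids:
        if bid not in out:
            out.append(bid)
        if len(out) == k:
            break
    return out

# e.g. retrieve with a similarity_top_k several times larger than k,
# resolve each hit to its base node id, then truncate:
ranked = ["node-0", "node-0", "node-3", "node-1", "node-0", "node-2"]
print(first_k_unique_base(ranked, 3))  # ['node-0', 'node-3', 'node-1']
```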