Building a RAG system with Elastic and Apple's OpenELM models
November 29, 2024 | by mebius
Author: Gustavo Llermaly, Elastic
How to deploy and test the new Apple models and build a RAG system with Elastic.
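In this article, we will learn how to deploy and test the new Apple models, and build a RAG system that emulates Apple Intelligence, using Elastic as the vector database and OpenELM as the model provider.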
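In April, Apple released its Open Efficient Language Models (OpenELM) in 270M, 450M, 1.1B, and 3B parameter sizes, each in a pretrained and an instruction-tuned (Instruct) version. Models with more parameters are usually better at complex tasks, but they are slower and consume more resources, while smaller models are faster and less demanding. The right choice depends on the problem we want to solve.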
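To make the size trade-off concrete, here is a rough back-of-the-envelope sketch of the memory needed just to hold each model's weights. It assumes 16-bit weights (2 bytes per parameter) and ignores activations and caches, so the numbers are lower bounds rather than measurements. The helper name `weights_gb` is illustrative.

```python
# Rough weights-only memory estimate for each OpenELM size.
# Assumption: 2 bytes per parameter (16-bit weights); real usage is higher.
OPENELM_PARAMS = {
    "270M": 270_000_000,
    "450M": 450_000_000,
    "1.1B": 1_100_000_000,
    "3B": 3_000_000_000,
}


def weights_gb(num_params, bytes_per_param=2):
    """Return the approximate size of the weights in gigabytes (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, params in OPENELM_PARAMS.items():
    print(f"OpenELM {name}: ~{weights_gb(params):.2f} GB of weights")
```

Under these assumptions, the largest model needs roughly eleven times the memory of the smallest, which is one reason the smaller variants are attractive for on-device use.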
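The framework used to build and train these models (CoreNet) is also available.
One advantage of the OpenELM models is that they can be converted to MLX, a deep learning framework optimized for devices with Apple Silicon processors, so they can benefit from this technology by training local models for those devices.
Apple has just released its new iPhones, and one of the new features is Apple Intelligence, which uses AI to help with tasks such as notification triage, context-aware recommendations, and email writing.
Let's build an application with Elastic and OpenELM that pursues the same goals!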
- Deploy the model
- Index the data
- Test the model
git clone https://huggingface.co/apple/OpenELM
Then, you need to get a Hugging Face access token.
Next, request access to the Llama-2-7b model on Hugging Face, because OpenELM uses its tokenizer.
python generate_openelm.py --model apple/OpenELM-270M-Instruct --hf_access_token [HF_ACCESS_TOKEN] --prompt 'Once upon a time there was' --generate_kwargs repetition_penalty=1.2 prompt_lookup_num_tokens=10
Once upon a time there was a man named John Smith. He had been born in the small town of Pine Bluff, Arkansas, and raised by his single mother, Mary Ann. John’s family moved to California when he was young, settling in San Francisco where he attended high school. After graduating from high school, John enlisted in the U.S. Army as a machine gunner. John’s first assignment took him to Germany, serving with the 1st Battalion, 12th Infantry Regiment. During this time, John learned German and quickly became fluent in the language. In fact, it took him only two months to learn all 3,000 words of the alphabet. John’s love for learning led him to attend college at Stanford University, majoring in history. While attending school, John also served as a rifleman in the 1st Armored Division. After completing his undergraduate education, John returned to California to join the U.S. Navy. Upon his return to California, John married Mary Lou, a local homemaker. They raised three children: John Jr., Kathy, and Sharon. John enjoyed spending time with
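Now we will index some documents in Elastic to use with the model.
To take full advantage of semantic search, make sure you deploy the ELSER model through an inference endpoint: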
PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}
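If this is your first time using ELSER, you may have to wait a while for the model to deploy. You can check the deployment status under Kibana > Machine Learning > Trained Models.
Tip: if you have not deployed your own ELSER model yet, read the article "Elasticsearch:部署 ELSER – Elastic Learned Sparse EncodeR" (Elasticsearch: Deploying ELSER) carefully.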
PUT mobile-assistant
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "english"
      },
      "description": {
        "type": "text",
        "analyzer": "english",
        "copy_to": "semantic_field"
      },
      "semantic_field": {
        "type": "semantic_text",
        "inference_id": "my-elser-model"
      }
    }
  }
}
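We use copy_to to copy the description field into semantic_field, so the same text is available for both full-text and semantic search. Now, let's add the documents: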
POST _bulk
{ "index" : { "_index" : "mobile-assistant", "_id": "email1"} }
{ "title": "Team Meeting Agenda", "description": "Hello team, Let's discuss our project progress in tomorrow's meeting. Please prepare your updates. Best regards, Manager" }
{ "index" : { "_index" : "mobile-assistant", "_id": "email2"} }
{ "title": "Client Proposal Draft", "description": "Hi, I've attached the draft of our client proposal. Could you review it and provide feedback? Thanks, Colleague" }
{ "index" : { "_index" : "mobile-assistant", "_id": "email3"} }
{ "title": "Weekly Newsletter", "description": "This week in tech: AI advancements, new smartphone releases, and cybersecurity updates. Read more on our website!" }
{ "index" : { "_index" : "mobile-assistant", "_id": "email4"} }
{ "title": "Urgent: Project Deadline Update", "description": "Dear team, Due to recent developments, we need to move up our project deadline. The new submission date is next Friday. Please adjust your schedules accordingly and let me know if you foresee any issues. We'll discuss this in detail during our next team meeting. Best regards, Project Manager" }
{ "index" : { "_index" : "mobile-assistant", "_id": "email5"} }
{ "title": "Invitation: Company Summer Picnic", "description": "Hello everyone, We're excited to announce our annual company summer picnic! It will be held on Saturday, July 15th, at Sunny Park. There will be food, games, and activities for all ages. Please RSVP by replying to this email with the number of guests you'll be bringing. We look forward to seeing you there! Best, HR Team" }
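If the emails come from a program rather than the Kibana console, the same bulk body can be generated in code. The sketch below uses only the standard library to build the newline-delimited JSON that the bulk API expects. The helper name `build_bulk_body` is illustrative, and sending the body to the cluster is left out.

```python
import json


def build_bulk_body(index_name, docs_by_id):
    """Build an NDJSON bulk body: one action line, then one source line, per document."""
    lines = []
    for doc_id, source in docs_by_id.items():
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline character.
    return "\n".join(lines) + "\n"


emails = {
    "email2": {
        "title": "Client Proposal Draft",
        "description": "Hi, I've attached the draft of our client proposal. "
                       "Could you review it and provide feedback? Thanks, Colleague",
    },
    "email3": {
        "title": "Weekly Newsletter",
        "description": "This week in tech: AI advancements, new smartphone releases, "
                       "and cybersecurity updates. Read more on our website!",
    },
}

print(build_bulk_body("mobile-assistant", emails))
```

Each document produces exactly two lines, which mirrors the action/source pairs in the request above.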
We will use a chat template to format the prompt.
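def build_prompt(question, elasticsearch_documents):
    # Join the retrieved documents into one context block.
    docs_text = "\n".join([
        f"Title: {doc['title']}\nDescription: {doc['description']}"
        for doc in elasticsearch_documents
    ])
    # Assumption: the rest of the template is missing here; the CONTEXT/QUESTION layout is a reconstruction.
    prompt = f"""
    You are Elastic Intelligence (EI), a virtual assistant on a cell phone. Answer questions about emails concisely and accurately.
    You can only answer based on the context provided by the user.
    CONTEXT:
    {docs_text}
    QUESTION:
    {question}
    """
    return prompt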
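Small models have limited context windows, so it can help to cap how much retrieved text goes into the prompt. The following is a minimal sketch and not part of the original walkthrough: `fit_documents` and its `max_chars` default are hypothetical, and a character count is only a rough stand-in for a token count.

```python
def fit_documents(documents, max_chars=2000):
    """Keep documents, in ranking order, until their combined text would exceed max_chars."""
    selected = []
    used = 0
    for doc in documents:
        size = len(doc["title"]) + len(doc["description"])
        if used + size > max_chars:
            # Stop at the first document that does not fit, so ranking order is preserved.
            break
        selected.append(doc)
        used += size
    return selected


docs = [
    {"title": "Team Meeting Agenda", "description": "Please prepare your updates."},
    {"title": "Weekly Newsletter", "description": "This week in tech: AI advancements."},
]
print(len(fit_documents(docs, max_chars=60)))
```

The trimmed list can then be passed to the prompt builder in place of the full search results.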
Now, using semantic search, let's add a function that fetches the documents relevant to the user's question from Elastic:
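from elasticsearch import Elasticsearch
# Assumed connection details; replace them with your own endpoint and API key.
client = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")
index_name = "mobile-assistant"

def retrieve_documents(question):
    search_body = {
        "query": {
            "semantic": {
                "query": question,
                "field": "semantic_field"
            }
        }
    }
    response = client.search(index=index_name, body=search_body)
    return [hit["_source"] for hit in response["hits"]["hits"]]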
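The retrieval logic can be exercised without a running cluster by separating query construction from response parsing. In the sketch below, `build_semantic_query` and `extract_sources` are illustrative helper names, the `size` cap is an addition that the function above does not use, and `sample_response` is hand-written to imitate the shape of a search response.

```python
def build_semantic_query(question, field="semantic_field", size=3):
    """Return a search body that runs a semantic query and caps the number of hits."""
    return {
        "size": size,
        "query": {"semantic": {"field": field, "query": question}},
    }


def extract_sources(response):
    """Pull the stored documents out of a search response."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


# Hand-written response for illustration only; the score is made up.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "email1", "_score": 1.0, "_source": {"title": "Team Meeting Agenda"}},
        ]
    }
}

print(build_semantic_query("Summarize my emails"))
print(extract_sources(sample_response))
```

Limiting the number of hits keeps the prompt short, which matters for a small model.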
Now, let's try the request "Summarize my emails". To make sending the prompt easier, we will call the generate function from the file generate_openelm.py instead of using the CLI.
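from OpenELM.generate_openelm import generate
# Assumption: the original call's arguments are missing here; these names mirror the CLI flags used earlier.
prompt = build_prompt("Summarize my emails", retrieve_documents("Summarize my emails"))
output_text, generation_time = generate(
    prompt=prompt, model="apple/OpenELM-270M-Instruct", hf_access_token="[HF_ACCESS_TOKEN]"
)
print("-----GENERATION TIME-----")
print(f'\033[92m {round(generation_time, 2)} \033[0m')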
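The first answers varied and were not very good. In some cases we got the correct answer, but in others we did not. The model returned details about its reasoning, HTML code, or people who were not mentioned in the context.
Try different prompts until you get the result you want.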
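Alongside prompt changes, a little post-processing can hide some of the noise described above. The sketch below strips an echoed prompt and HTML-like tags from the generated text. It is a hypothetical helper rather than part of OpenELM, the sample strings are made up for illustration, and it cannot correct answers that mention people or facts missing from the context.

```python
import re


def clean_response(output_text, prompt):
    """Remove an echoed prompt and any HTML-like tags from generated text."""
    text = output_text
    # Causal language models often return the prompt followed by the completion.
    if text.startswith(prompt):
        text = text[len(prompt):]
    # Drop anything that looks like an HTML tag, such as <p> or </div>.
    text = re.sub(r"</?[a-zA-Z][^>]*>", "", text)
    return text.strip()


raw = "Q: Summarize my emails\nA: <p>You have a team meeting tomorrow.</p>"
print(clean_response(raw, "Q: Summarize my emails\nA:"))
# prints: You have a team meeting tomorrow.
```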
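Although the OpenELM models do not try to compete at a commercial level, they are an interesting alternative for experimental scenarios, because the complete training pipeline is public and the framework is highly customizable for your own data. They are a good fit for developers who need offline, customized, and efficient models.
The results may not be as impressive as those of other models, but the option to train this model from scratch is very attractive. In addition, the ability to convert it for Apple Silicon using CoreNet opens the door to optimized local models for Apple devices. If you are interested in how to convert OpenELM to run on Apple Silicon processors, check out this repo.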
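Elasticsearch includes many new features to help you build the best search solution for your use case. Dive into our sample notebooks to learn more, start a free cloud trial, or try Elastic on your local machine now.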
Original article: Using Elastic and Apple’s OpenELM models for RAG systems – Search Labs