A few months ago, we ran a benchmark comparing a traditional parser with Mistral 7B, an open-source LLM. The quality of the parsed results from Mistral 7B was quite impressive for a model with only 7B parameters. The one thing we weren't satisfied with was the processing time.

Recently, I stumbled upon Groq, a company whose mission is to revolutionize inference speed. They developed a chip for inference, which they call the Language Processing Unit (LPU). I have tested it, and it is really impressive. I don't understand the chip technology in depth, but as with CPUs and GPUs, I believe it will keep getting faster, and hopefully we can get inference time down to 1 second consistently.

Now that inference time has come down, I am interested to see whether the quality of Mixtral 8x7B running on Groq is on par with OpenAI GPT-4 at parsing HTML. As in the previous article, we will run a similar benchmark.
The Comparison

Query: Dentist

Query: Dentist (1st Result)
GPT-4 (first block) | Mixtral 8x7B (second block)
{
"title": "Dental Resource Management",
"type": "Dentist",
"phone": "(512) 462-2959",
"address": "Austin, TX",
"years_in_business": null,
"hours": "Closed ⋅ Opens 6 AM",
"rating": null,
"reviews": "No reviews",
"price": null,
"description": null,
"service_options": {},
"is_operating": false
}
Processed in 6.28s
{
"title": "Dental Resource Management",
"type": "Dentist",
"phone": "(512) 462-2959",
"address": "Austin, TX",
"years_in_business": null,
"hours": "Opens 6 AM",
"rating": null,
"reviews": "No reviews",
"price": null,
"description": null,
"service_options": {
"in store shopping": null,
"delivery": null
},
"is_operating": "Closed"
}
Processed in 0.91s
Mixtral almost nailed it: "Closed" is missing from hours, and is_operating comes back as the string "Closed" instead of false. However, the inference time is only 0.91 seconds, which is impressive.
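Comparing two outputs by eye gets tedious, so here is a minimal sketch of how the field-by-field comparison could be automated. The helper name diff_fields is hypothetical and was not part of the original benchmark; the sample values are copied from the result above, trimmed to a few fields.

```python
import json


def diff_fields(reference: dict, candidate: dict) -> dict:
    """Return {field: (reference_value, candidate_value)} for fields that differ.

    A missing field and an explicit null are treated as equal, because
    dict.get returns None in both cases.
    """
    keys = list(reference) + [k for k in candidate if k not in reference]
    return {
        key: (reference.get(key), candidate.get(key))
        for key in keys
        if reference.get(key) != candidate.get(key)
    }


# Trimmed copies of the two outputs for the 1st dentist result.
gpt4 = json.loads(
    '{"hours": "Closed ⋅ Opens 6 AM", "reviews": "No reviews", '
    '"service_options": {}, "is_operating": false}'
)
mixtral = json.loads(
    '{"hours": "Opens 6 AM", "reviews": "No reviews", '
    '"service_options": {"in store shopping": null, "delivery": null}, '
    '"is_operating": "Closed"}'
)

differences = diff_fields(gpt4, mixtral)
for field, (ref, cand) in differences.items():
    print(f"{field}: GPT-4={ref!r} Mixtral={cand!r}")
```

A strict comparison like this also flags service_options, where Mixtral lists options with null values and GPT-4 returns an empty object, so some fields may need a looser rule.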
Query: Dentist (2nd Result)
GPT-4 (first block) | Mixtral 8x7B (second block)
{
"title": "Dr. Brendan Loehr",
"type": "Dentist",
"phone": "(512) 713-1099",
"address": "201 W 5th St",
"years_in_business": null,
"hours": null,
"rating": null,
"reviews": "0",
"price": null,
"description": null,
"service_options": {},
"is_operating": true
}
Processed in 5.87s
{
"title": "Dr. Brendan Loehr",
"type": "Dentist",
"phone": "(512) 713-1099",
"address": "201 W 5th St",
"years_in_business": null,
"hours": null,
"rating": null,
"reviews": "No reviews",
"price": null,
"description": null,
"service_options": {
"in store shopping": false,
"delivery": false
},
"is_operating": true
}
Processed in 0.86s
Mixtral scores perfectly and completes the task in only 0.86 seconds.
Query: Dentist (3rd Result)
GPT-4 (first block) | Mixtral 8x7B (second block)
{
"title": "Tejas Patel, DDS",
"type": "Cosmetic dentist",
"phone": "(512) 476-2336",
"address": "221 W 6th St #940, Austin, TX 78701",
"years_in_business": null,
"hours": "Closed ⋅ Opens 8 AM Mon",
"rating": 4.9,
"reviews": 275,
"price": null,
"description": "Best cosmetic Dentist in Austin.",
"service_options": null,
"is_operating": false
}
Processed in 9.82s
{
"title": "Tejas Patel, DDS",
"type": "Cosmetic dentist",
"phone": "(512) 476-2336",
"address": "221 W 6th St #940, Austin, TX 78701",
"years_in_business": null,
"hours": "Closed",
"rating": 4.9,
"reviews": 275,
"price": null,
"description": "Best cosmetic Dentist in Austin.",
"service_options": {
"in store shopping": null,
"delivery": null
},
"is_operating": false
}
Processed in 1.06s
Mixtral made a mistake in hours: the opening time is missing this time.
Query: Mexican Restaurant

Query: Mexican Restaurant (1st Result)
GPT-4 (first block) | Mixtral 8x7B (second block)
{
"title": "Tacos provincia",
"type": "Mexican",
"phone": "Not Available",
"address": "658 Red River St",
"years_in_business": "Not Available",
"hours": "Not Available",
"rating": "5.0",
"reviews": "12",
"price": "$1–10",
"description": "Not Available",
"service_options": {
"Dine-in": true,
"Curbside pickup": true
},
"is_operating": "Not Available"
}
Processed in 8.75s
{
"title": "Tacos provincia",
"type": "Mexican",
"phone": "N/A",
"address": "658 Red River St",
"years_in_business": "N/A",
"hours": "N/A",
"rating": 5.0,
"reviews": 12,
"price": "$1–10",
"description": "N/A",
"service_options": {
"Dine-in": true,
"Curbside pickup": true
},
"is_operating": "N/A"
}
Processed in 1.55s
👍
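The two outputs above agree on the facts but use different placeholders for missing data ("Not Available" vs. "N/A", and "Not Found" appears in a later result). If the results feed a pipeline, one option is to map those placeholders to null after parsing. The following is only a sketch: the PLACEHOLDERS set and the normalize helper are my own assumptions, not something the benchmark used.

```python
# Placeholder strings seen in the benchmark outputs (assumed list, extend as needed).
PLACEHOLDERS = {"n/a", "not available", "not found"}


def normalize(value):
    """Replace placeholder strings with None, recursing into nested dicts."""
    if isinstance(value, str) and value.strip().lower() in PLACEHOLDERS:
        return None
    if isinstance(value, dict):
        return {key: normalize(item) for key, item in value.items()}
    return value


raw = {
    "title": "Tacos provincia",
    "phone": "N/A",
    "hours": "Not Available",
    "rating": 5.0,
    "service_options": {"Dine-in": True, "Curbside pickup": True},
}
cleaned = normalize(raw)
print(cleaned)
```

Only values are normalized here; dictionary keys are left untouched.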
Query: Mexican Restaurant (2nd Result)
GPT-4 (first block) | Mixtral 8x7B (second block)
{
"title": "Velvet Taco",
"type": "Mexican",
"phone": "N/A",
"address": "522 Congress Ave. Suite 100",
"years_in_business": "N/A",
"hours": "Closed ⋅ Opens 11 AM",
"rating": 4.6,
"reviews": 1500,
"price": "$10–20",
"description": "N/A",
"service_options": {
"Dine-in": true,
"Takeout": true,
"No-contact delivery": true
},
"is_operating": false
}
Processed in 10.70s
{
"title": "Velvet Taco",
"type": "Mexican",
"phone": null,
"address": "522 Congress Ave. Suite 100",
"years_in_business": null,
"hours": "Opens 11 AM",
"rating": 4.6,
"reviews": 1500,
"price": "$10–20",
"description": null,
"service_options": {
"Dine-in": true,
"Takeout": true,
"No-contact delivery": true
},
"is_operating": false
}
Processed in 1.02s
The only difference is in hours, where "Closed" is left out. However, Mixtral does interpret is_operating correctly.
Query: Mexican Restaurant (3rd Result)
GPT-4 (first block) | Mixtral 8x7B (second block)
{
"title": "Iron Cactus Mexican Restaurant and Margarita Bar",
"type": "Mexican",
"phone": "Not Available",
"address": "606 Trinity St",
"years_in_business": "Not Available",
"hours": "Closed ⋅ Opens 11 AM",
"rating": "4.1",
"reviews": "3.7K",
"price": "$20–30",
"description": "Buzzy spot with happy hour & tequilas",
"service_options": {
"Not Available": true
},
"is_operating": false
}
Processed in 8.36s
{
"title": "Iron Cactus Mexican Restaurant and Margarita Bar",
"type": "Mexican Restaurant",
"phone": "Not Found",
"address": "606 Trinity St",
"years_in_business": "Not Found",
"hours": "Closed - Opens 11 AM",
"rating": 4.1,
"reviews": 3.7,
"price": "$20–30",
"description": "Buzzy spot with happy hour & tequilas",
"service_options": {
"in_store_shopping": false,
"delivery": false,
"curbside_pickup": false
},
"is_operating": false
}
Processed in 1.22s
In the Mixtral result, reviews is 3.7 when it should be 3.7K or 3700, a critical mistake.
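One way to reduce this kind of error is to ask the model to return the review count exactly as shown on the page (for example "3.7K") and convert it in code. The sketch below assumes that approach; parse_review_count is a hypothetical helper, and it cannot recover a suffix the model has already dropped.

```python
import re

# Multipliers for the abbreviated suffixes we expect to see (assumption).
_SUFFIXES = {"k": 1_000, "m": 1_000_000}

# Digits with optional thousands separators and decimals, then an optional K/M.
_COUNT_RE = re.compile(r"\s*(\d[\d,]*(?:\.\d+)?)\s*([kKmM]?)\s*")


def parse_review_count(text):
    """Convert strings such as '3.7K', '1,500' or '12' to an integer count.

    Returns None when the text is missing or not a recognizable count.
    """
    if text is None:
        return None
    match = _COUNT_RE.fullmatch(str(text))
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    multiplier = _SUFFIXES.get(match.group(2).lower(), 1)
    return round(number * multiplier)


for sample in ["3.7K", "12", "1,500", "No reviews"]:
    print(sample, "->", parse_review_count(sample))
```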
Query: Yoga Studio

Query: Yoga Studio (1st Result)
GPT-4 (first block) | Mixtral 8x7B (second block)
{
"title": "YTX Yoga, Strength, & HIIT Training",
"type": "Yoga studio",
"phone": "(512) 502-5183",
"address": "206 E 4th St, Austin, TX 78701",
"years_in_business": "10+ years",
"hours": "Closed ⋅ Opens 6:30 AM",
"rating": "4.7",
"reviews": "297",
"price": null,
"description": null,
"service_options": {
"Onsite services": true,
"Online classes": true
},
"is_operating": false
}
Processed in 13.72s
{
"title": "YTX Yoga, Strength, & HIIT Training",
"type": "Yoga studio",
"phone": "(512) 502-5183",
"address": "206 E 4th St",
"years_in_business": "10+",
"hours": "Closed ⋅ Opens 6:30 AM",
"rating": "4.7",
"reviews": "297",
"description": null,
"service_options": {
"Onsite services": true,
"Online classes": true
},
"is_operating": false
}
Processed in 1.71s
👍
Query: Yoga Studio (2nd Result)
GPT-4 (first block) | Mixtral 8x7B (second block)
{
"title": "Practice Yoga Austin",
"type": "Yoga studio",
"phone": "(512) 730-1638",
"address": "1103 E 6th St, Austin, TX",
"years_in_business": "10+ years",
"hours": "Closed ⋅ Opens 7 AM",
"rating": "5.0",
"reviews": "97",
"price": null,
"description": "\"Practice yoga studio is very unique.\"",
"service_options": {},
"is_operating": false
}
Processed in 10.52s
{
"title": "Practice Yoga Austin",
"type": "Yoga studio",
"phone": "(512) 730-1638",
"address": "1103 E 6th St",
"years_in_business": "10+",
"hours": "Closed - Opens 7 AM",
"rating": "5.0",
"reviews": "97",
"description": "Practice yoga studio is very unique.",
"service_options": {
"in_store_shopping": false,
"delivery": false,
"curbside_pickup": false
},
"is_operating": false
}
Processed in 1.34s
👍
Query: Yoga Studio (3rd Result)
GPT-4 (first block) | Mixtral 8x7B (second block)
{
"title": "Sanctuary Studio & Loft Rentals",
"type": "Yoga studio",
"phone": "(512) 337-2213",
"address": "1006 S 8th St, Austin, TX 78704",
"years_in_business": "10+ years",
"hours": "Closed ⋅ Opens 9 AM",
"rating": "4.8",
"reviews": "55",
"description": "Easygoing non-profit yoga center",
"service_options": {},
"is_operating": false
}
Processed in 9.40s
{
"title": "Sanctuary Studio & Loft Rentals",
"type": "Yoga studio",
"phone": "(512) 337-2213",
"address": "1006 S 8th St",
"years_in_business": "10+ years",
"hours": "Closed ⋅ Opens 9 AM",
"rating": "4.8",
"reviews": "55",
"description": "Easygoing non-profit yoga center",
"service_options": {
"in_store_shopping": false,
"delivery": false,
"curbside_pickup": false
},
"is_operating": false
}
Processed in 1.74s
👍
Thoughts

GPT-4 scored perfectly in parsing the HTML; however, its inference time isn't ideal. Mixtral 8x7B running on Groq, on the other hand, is much faster; for some of the results it even finishes in under 1 second. I think it could be considered for production use. It did make some mistakes in the results, but these can likely be resolved by enhancing the prompt.
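To put the speed difference in one number, the processing times reported above can be averaged. This is a quick sketch over the nine timings per model listed in this article; the variable names are my own.

```python
from statistics import mean

# Processing times in seconds, in the order the results appear above.
gpt4_times = [6.28, 5.87, 9.82, 8.75, 10.70, 8.36, 13.72, 10.52, 9.40]
mixtral_times = [0.91, 0.86, 1.06, 1.55, 1.02, 1.22, 1.71, 1.34, 1.74]

gpt4_mean = mean(gpt4_times)
mixtral_mean = mean(mixtral_times)
speedup = gpt4_mean / mixtral_mean
sub_second = sum(1 for t in mixtral_times if t < 1.0)

print(f"GPT-4 mean:   {gpt4_mean:.2f}s")
print(f"Mixtral mean: {mixtral_mean:.2f}s")
print(f"Speedup:      {speedup:.1f}x")
print(f"Mixtral runs under 1s: {sub_second} of {len(mixtral_times)}")
```

With these figures, the averages are about 9.27 seconds for GPT-4 and 1.27 seconds for Mixtral on Groq, roughly a 7x difference. Nine samples per model is a small set, so treat this as indicative only.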
I am looking forward to experimenting more with open-source LLMs. Follow us for our latest updates.