Topic: evaluation
Something interesting about evaluation
evaluation,Short and sweet LISP editing
User: abo-abo
Home Page: http://oremacs.com/lispy/
evaluation,⚡️A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion 🐍
User: amenra
Home Page: https://amenra.github.io/ranx
evaluation,Python implementation of the IOU Tracker
User: bochinski
Home Page: http://www.nue.tu-berlin.de
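The IOU Tracker associates detections across frames by thresholding the intersection-over-union of their bounding boxes. A minimal stdlib sketch of that core quantity (the `(x1, y1, x2, y2)` box convention here is an assumption for illustration, not the repo's API):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1 -- this
    coordinate convention is an assumption, not the repo's API.
    """
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 2x2 boxes overlapping in a 1x1 square: intersection 1, union 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```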
evaluation,Case Recommender: A Flexible and Extensible Python Framework for Recommender Systems
Organization: caserec
evaluation,CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark (a benchmark for Chinese medical information processing)
Organization: cbluebenchmark
Home Page: https://tianchi.aliyun.com/dataset/dataDetail?dataId=95414&lang=en-us
evaluation,ERRor ANnotation Toolkit: Automatically extract and classify grammatical errors in parallel original and corrected sentences.
User: chrisjbryant
evaluation,☁️ 🚀 📊 📈 Evaluating state of the art in AI
Organization: cloud-cv
Home Page: https://eval.ai
evaluation,SuperCLUE: A Comprehensive Benchmark for Chinese General-Purpose Foundation Models
Organization: cluebenchmark
Home Page: https://www.superclueai.com
evaluation,A Simple Math and Pseudo-C# Expression Evaluator in One C# File. Can also execute small C#-like scripts
User: codingseb
evaluation,Avalanche: an End-to-End Library for Continual Learning based on PyTorch.
Organization: continualai
Home Page: http://avalanche.continualai.org
evaluation,Simple Safe Sandboxed Extensible Expression Evaluator for Python
User: danthedeckie
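The idea behind a sandboxed expression evaluator is to parse the expression into an AST and walk only a whitelist of node types, so arbitrary code can never run. A minimal stdlib sketch of that whitelisting pattern (this is not simpleeval's API, just the technique it embodies):

```python
import ast
import operator

# Whitelisted binary operators -- anything outside this table is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def safe_eval(expr):
    """Evaluate basic arithmetic without ever calling eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("disallowed syntax")
    return walk(ast.parse(expr, mode="eval"))

print(safe_eval("2 * (3 + 4)"))  # 14; safe_eval("__import__('os')") raises
```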
evaluation,An extensive evaluation and comparison of 28 state-of-the-art superpixel algorithms on 5 datasets.
User: davidstutz
evaluation,A General Toolbox for Identifying Object Detection Errors
User: dbolya
Home Page: https://dbolya.github.io/tide
evaluation,XAI - An eXplainability toolbox for machine learning
Organization: ethicalml
Home Page: https://ethical.institute/principles.html#commitment-3
evaluation,Benchmarking long-form factuality in large language models. Original code for our paper "Long-form factuality in large language models".
Organization: google-deepmind
Home Page: https://arxiv.org/abs/2403.18802
evaluation,FuzzBench - Fuzzer benchmarking as a service.
Organization: google
Home Page: https://google.github.io/fuzzbench/
evaluation,A toolbox repository to help evaluate various methods that perform image matching on a pair of images.
User: grumpyzhou
evaluation,🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
Organization: huggingface
Home Page: https://huggingface.co/docs/evaluate
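Metric libraries like this one are typically called with parallel lists of predictions and references and return a dict of scores. A dependency-free accuracy sketch of that calling convention (the function name is ours, a stdlib stand-in, not the library's API):

```python
def compute_accuracy(predictions, references):
    """Fraction of predictions equal to their references.

    Mirrors the predictions/references convention common to metric
    libraries; this helper itself is an illustrative stand-in.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references differ in length")
    correct = sum(p == r for p, r in zip(predictions, references))
    return {"accuracy": correct / len(references)}

print(compute_accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # {'accuracy': 0.75}
```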
evaluation,An open-source visual programming environment for battle-testing prompts to LLMs.
User: ianarawjo
Home Page: https://chainforge.ai
evaluation,A collection of datasets that pair questions with SQL queries.
User: jkkummerfeld
Home Page: http://jkk.name/text2sql-data/
evaluation,Arbitrary expression evaluation for golang
User: knetic
evaluation,🪢 Open source LLM engineering platform. Observability, metrics, evals, prompt management, testing, prompt playground, datasets, LLM evaluations -- 🍊YC W23 🤖 integrate via Typescript, Python / Decorators, OpenAI, Langchain, LlamaIndex, Litellm, Instructor, Mistral, Perplexity, Claude, Gemini, Vertex
Organization: langfuse
Home Page: https://langfuse.com/docs
evaluation,The production toolkit for LLMs. Observability, prompt management and evaluations.
Organization: lunary-ai
Home Page: https://lunary.ai
evaluation,Evaluation code for various unsupervised automated metrics for Natural Language Generation.
Organization: maluuba
Home Page: http://arxiv.org/abs/1706.09799
evaluation,Python package for the evaluation of odometry and SLAM
User: michaelgrupp
Home Page: https://michaelgrupp.github.io/evo/
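A standard odometry/SLAM evaluation quantity is the absolute trajectory error (ATE), usually reported as an RMSE over aligned poses. A minimal 2D stdlib sketch, assuming the trajectories are already associated and in the same frame (tools like evo also handle timestamp association and alignment):

```python
import math

def ate_rmse(estimated, reference):
    """Root-mean-square absolute trajectory error over (x, y) positions.

    Assumes pose-to-pose association and frame alignment were done
    upstream -- a simplification relative to full SLAM evaluation.
    """
    if len(estimated) != len(reference):
        raise ValueError("trajectories must have equal length")
    sq_errors = [(ex - rx) ** 2 + (ey - ry) ** 2
                 for (ex, ey), (rx, ry) in zip(estimated, reference)]
    return math.sqrt(sum(sq_errors) / len(sq_errors))

est = [(0.0, 0.0), (1.1, 0.0), (2.0, 0.1)]
ref = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(ate_rmse(est, ref))
```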
evaluation,A unified evaluation framework for large language models
Organization: microsoft
Home Page: http://aka.ms/promptbench
evaluation,The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".
User: mlgroupjlu
Home Page: https://arxiv.org/abs/2307.03109
evaluation,🤘 awesome-semantic-segmentation
User: mrgloom
evaluation,OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Organization: open-compass
Home Page: https://opencompass.org.cn/
evaluation,Open-source evaluation toolkit of large vision-language models (LVLMs), support GPT-4v, Gemini, QwenVLPlus, 30+ HF models, 15+ benchmarks
Organization: open-compass
Home Page: https://rank.opencompass.org.cn/leaderboard-multimodal
evaluation,Expression evaluation in golang
Organization: paesslerag
evaluation,SemanticKITTI API for visualizing dataset, processing data, and evaluating results.
Organization: prbonn
Home Page: http://semantic-kitti.org
evaluation,Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.
Organization: promptfoo
Home Page: https://www.promptfoo.dev/
evaluation,Behavioral "black-box" testing for recommender systems
Organization: reclist
Home Page: https://reclist.io
evaluation,Building a modern functional compiler from first principles. (http://dev.stephendiehl.com/fun/)
User: sdiehl
evaluation,Multi-class confusion matrix library in Python
User: sepandhaghighi
Home Page: http://pycm.io
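A multi-class confusion matrix is just a table counting (actual, predicted) label pairs, from which per-class metrics are derived. A minimal stdlib sketch of that structure (the dict-of-dicts layout is an illustrative choice, not pycm's API):

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Build a {true_label: {predicted_label: count}} table.

    The nested-dict layout is an illustrative choice; libraries
    like pycm expose richer objects with derived statistics.
    """
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}

cm = confusion_matrix(["cat", "dog", "cat", "bird"],
                      ["cat", "cat", "cat", "bird"])
print(cm)  # "dog" misclassified as "cat" shows up at cm["dog"]["cat"]
```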
evaluation,Python Single Object Tracking Evaluation
User: strangerzhang
evaluation,An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Organization: tatsu-lab
Home Page: https://tatsu-lab.github.io/alpaca_eval/
evaluation,TCExam is a CBA (Computer-Based Assessment) system (e-exam, CBT - Computer Based Testing) for universities, schools and companies, that enables educators and trainers to author, schedule, deliver, and report on surveys, quizzes, tests and exams.
Organization: tecnickcom
Home Page: http://www.tcexam.org
evaluation,Resource, Evaluation and Detection Papers for ChatGPT
Organization: thu-keg
evaluation,High-fidelity performance metrics for generative models in PyTorch
User: toshas
evaluation,AutoPrompt: Automatic Prompt Construction for Masked Language Models.
Organization: ucinlp
evaluation,UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform root cause analysis on failure cases and give insights on how to resolve them.
Organization: uptrain-ai
Home Page: https://uptrain.ai/
evaluation,Klipse is a JavaScript plugin for embedding interactive code snippets in tech blogs.
User: viebel
Home Page: http://blog.klipse.tech/
evaluation,Visual Object Tracking (VOT) challenge evaluation toolkit
Organization: votchallenge
evaluation,Research on value evaluation and alignment for Chinese large language models
Organization: x-plug
evaluation,(IROS 2020, ECCVW 2020) Official Python Implementation for "3D Multi-Object Tracking: A Baseline and New Evaluation Metrics"
User: xinshuoweng
Home Page: http://www.xinshuoweng.com/
evaluation,recommender system library for the CLR (.NET)
User: zenogantner
Home Page: http://mymedialite.net
evaluation,End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
User: zzw922cn
evaluation,C# Eval Expression | Evaluate, Compile, and Execute C# code and expression at runtime.
Organization: zzzprojects
Home Page: https://eval-expression.net/