Profile // Evaluation Report · 2026

Marian E. Arenskrieger

AI evaluation and data quality on frontier-model projects: auditing, building, and stress-testing the datasets behind agentic AI.

DOMAIN: AI Eval · Data Quality · SWE
PATH: Finance → Data Science

Summary

AI-evaluation specialist who builds and audits the datasets frontier models are trained and tested on — with a steady focus on correctness, reliability, and reproducibility.

My background bridges two worlds: finance (a banking apprenticeship, a B.A. in Financial Management, and several years of self-employed quantitative trading) and data science, which I moved into deliberately. That combination of financial rigor and a data-science toolkit is what I bring to evaluating agentic and function-calling AI systems.

What I do

FOCUS / 04
01 Evaluation

AI Evaluation & RLHF

Rubric-based scoring for correctness, reasoning, and instruction-following. Pairwise model comparisons, multi-turn prompt design, and documented failure modes and edge cases.

02 Data Quality

Auditing & QA

QA and rubric-based audits of contributors' datasets for function-calling and agentic-AI projects — verifying correctness, format compliance, and consistency before delivery.

03 Engineering

Tooling & Environments

Deploying local frontier-model environments to generate datasets, and forking and extending open-source tooling: JSON support in Cerberus, multi-layer validation and error detection in Haystack.

04 Quant

Finance & Markets

A decade across capital markets, proprietary trading, and financial modelling — the quantitative backbone behind the data work, plus applied AI use cases in financial consulting.

Experience

LOG / REVERSE-CHRON
[01] Jan 2026 – Present

AI Evaluation, Data Quality & Software Engineering

Labelbox · Remote · Clients: leading AI labs
Agentic AI Master Reviewer · Software Engineer – Machine Learning · Senior Machine Learning Expert
  • QA and rubric-based auditing of other contributors' datasets for function-calling and agentic-AI projects — verifying correctness, format compliance, and consistency before delivery, and enforcing quality standards within the Master Review team.
  • Deployment and configuration of local model environments to run frontier models against real tasks and generate datasets; construction of training and evaluation datasets, including HFI problem sets, for frontier-model coding tasks.
  • Forking and internal extension of open-source tooling — JSON support in Cerberus; multi-layer validation and error detection in Haystack — delivered as part of the dataset.
  • RLHF evaluation and multi-turn prompt design for agentic-coding tasks, with pairwise comparisons across frontier models and calibrated, rubric-based scoring for correctness, reasoning, and instruction-following; systematic documentation of failure modes and edge cases.
[02] Oct 2025 – May 2026

Finance & AI Intern

MLP SE · Wiesloch, Germany · Part-time
Capstone: AI for Financial Consulting & Recruiting
  • Analysis and conceptual design of AI use cases to support and personalize financial advisory.
  • Data-driven approaches to increase qualified applicants via AI targeting.
[03] Jun 2025 – Mar 2026

Machine Learning Specialist

Scale AI · Freelance, Remote
  • Mathematical evaluation of ML models for correctness, reasoning quality, and quantitative accuracy.
  • Rubric-based rating of model outputs on quantitative and reasoning tasks; identification of errors in model-generated reasoning and solutions.
[04] Sep 2019 – Jun 2025

Trader & Market Analyst

BraveTrade · Self-Employed, Remote
  • Proprietary trading across cryptocurrencies, equities, and options on a commercial basis.
  • Development and backtesting of trading strategies across spot and derivatives markets using statistical modeling.
  • Data-driven market and risk analysis; market analysis and trading coaching for private clients.
[05] Jan 2018 – Oct 2021

Cryptocurrency Mining Operator

BraveTrade · Self-Employed, Remote
  • Commercial operation of a cryptocurrency-mining business; procurement (leasing) and operation of mining hardware.
  • Continuous profitability (ROI) and energy-cost optimization; configuration, uptime monitoring, and documentation for tax compliance.
[06] Aug 2016 – May 2018

Bank Clerk & Banking Apprenticeship

VR-Bank eG Osnabrücker Nordland · Fürstenau, Germany
  • Banking operations and client work alongside a formal apprenticeship — the foundation of the finance side of my profile.

Capabilities

MATRIX / 04 DOMAINS
AI / Machine Learning · 06
Artificial Intelligence · Machine Learning · Model Evaluation · Model Development · Artificial Neural Networks · RLHF Evaluation
Data Science & Statistics · 08
Data Science · Data Analytics · Exploratory Data Analysis · Statistical Analysis · Statistical Modeling · Predictive Analytics · Data Visualisation · Data Modeling
Engineering & Tooling · 05
Python · Cerberus · Haystack · Function-Calling Systems · Local Model Deployment
Finance & Business · 06
Financial Modelling · Capital Markets · Technical Analysis · Quantitative Analysis · Business Intelligence · Consulting

Education & certifications

VERIFIED / CREDENTIALS

Education

Master of Data Science (MDS)
University of Pittsburgh, USA · Remote
NOV 2024 – PRESENT · Grade A · GPA 3.8
Applied Data Science Program
MIT Professional Education, USA · Remote
MAR 2025 – JUN 2025 · Grade A
Mathematics for Machine Learning
Imperial College London, UK · Remote
SEP 2024 – NOV 2024 · Grade A · 98.58%
Financial Management, B.A.
IU International University of Applied Sciences, Germany
AUG 2018 – JUL 2022 · Grade B
Apprenticeship in Banking
Genossenschaftsakademie, Rastede, Germany
AUG 2016 – MAY 2018 · Grade B

Certifications

Google Cloud Certified — Machine Learning Engineer
Applied Data Science Program, MIT Professional Education · JUN 2025
Mathematics for Machine Learning, Imperial College London · NOV 2024
Career Essentials in Data Analysis, Microsoft · JUN 2025
Generative KI in der Softwareentwicklung (Generative AI in Software Development), Microsoft · JUN 2025
Microsoft Azure KI Grundwissen (Microsoft Azure AI Fundamentals) · JUN 2025
Certified Blockchain & Finance Professional™ · FEB 2020

Extracurricular

Academic Mentor, University of Pittsburgh · support for assigned freshmen · 2025 – PRESENT
Code for Germany, Open Knowledge Foundation DE · open-source projects · 2024 – PRESENT

Languages

German · NATIVE
English · C2
Japanese · BASIC

Off the clock

Reading · Sci-Fi · Programming · Swimming · Gaming · Traveling

Born 15 May 1998. A long-standing fascination with science fiction is part of what drew me to AI in the first place — and it keeps me curious about where these systems are headed.