14 minutes of reading
AI Quality Monitoring in Real Estate: Metrics, Alerts, and Rollbacks

Maksymilian Konarski
31 October 2025


Table of Contents
1. Why AI quality monitoring for AI in business matters now in real estate (LLM, RAG)
2. Defining the metrics that matter: relevance, usefulness, and safety for LLM and RAG systems
3. Quality alerts, anomaly detection, human sampling, AI red teaming, and data minimization
4. Feedback loops and LLMOps in practice: Using user input to improve
5. Champion/challenger model version comparison for LLM and RAG assistants
6. Rollback procedures and lock-release mechanisms that keep operations safe
7. A simple-to-advanced workflow: from scripts and spreadsheets to observability platforms
8. Real estate use cases: lead-qualification agent, AI PDF extraction, and RAG assistants
9. How to detect hallucinations in practice, without overengineering
10. Aligning metrics with business outcomes that matter to sales and operations
11. Governance, audit, and data minimization without slowing delivery
12. The human review loop: who does it, how often, and what they look for
13. Tooling notes: where to automate and where to stay manual
14. What makes AI quality monitoring different from traditional IT, and why that matters for your property business
15. A checklist for rollbacks, without drama
16. Common mistakes to avoid so your LLM, RAG, and agents don’t drift
17. Bringing it all together with iMakeable: real projects, real guardrails
When AI touches real clients, properties, and revenue, “good enough” is not enough, and AI quality monitoring becomes non-negotiable. Real estate firms rolling out an LLM-based assistant, a RAG search across listings and due-diligence documents, or a lead-qualification agent quickly learn that quality monitoring is what protects brand, compliance, and deal flow. This article lays out how to define quality metrics (relevance, usefulness, safety), set up alerts, choose sample sizes for human evaluation, compare model versions with a champion/challenger approach, and perform rollbacks without drama. We anchor every step in real estate use cases (property intake, tenant screening, AI PDF extraction of leases, and investor Q&A) and we keep the language business-first, so your commercial, sales, and operations leaders can steer the program, not just your data team. You will see references to RAG, LLM, AI in business, LLMOps in practice, AI red teaming, and data minimization throughout, as they are the practical tools to keep your systems reliable in the field.
If you need a starting point this quarter, set thresholds before you deploy. For example: define a relevance minimum of 90% for answers in your buyer advisory chatbot, a usefulness bar based on task completion (e.g., lead enrichment completed in one pass), and a zero-tolerance rule for safety violations. Establish a rollback trigger: if any metric drops by more than 5% for 48 hours, revert automatically. Add a recurring plan to manually review 1-5% of sessions each week. A simple policy written in plain language, coupled with automated checks, stops most mishaps before customers notice.
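As a concrete illustration, a policy like the one above can live in a small, versioned config that both your alerting scripts and your release checklist read from. This is a minimal sketch; the numbers mirror the examples in this article, and the field names are placeholders to adapt, not a standard schema.

```python
# quality_policy.py - a minimal, versioned quality policy (illustrative values only).
from dataclasses import dataclass

@dataclass(frozen=True)
class QualityPolicy:
    relevance_min: float = 0.90          # weekly relevance score must stay at or above 90%
    usefulness_min: float = 0.85         # e.g., share of sessions where the task completed in one pass
    safety_violations_allowed: int = 0   # zero tolerance: any violation escalates immediately
    rollback_drop_pct: float = 0.05      # revert if a metric drops by more than 5%...
    rollback_window_hours: int = 48      # ...sustained for 48 hours
    weekly_review_sample_rate: float = 0.03  # manually review 1-5% of sessions; 3% as a middle ground

POLICY = QualityPolicy()
```

Keeping these numbers in one place, under version control, makes it obvious what changed when a threshold is later tightened or relaxed.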
To get fast traction, you don’t need a heavy platform on day one. Many high-performing teams begin with a spreadsheet and a few scripts that compute relevance and usefulness scores from sample transcripts, route flagged sessions to a reviewer, and notify owners in Slack or email when alerts fire. Start small, write down the rules, and scale later; a lightweight loop built now prevents more incidents than a grand platform plan started next year.
Finally, be transparent with your teams about how AI is judged. Show agents and brokers a few visual examples of what “relevant” and “useful” look like on your sales process flows, and what “unsafe” means in your compliance context. When non-technical stakeholders can recognize good vs. bad outputs, they become allies who spot issues early and submit better feedback.
Why AI quality monitoring for AI in business matters now in real estate (LLM, RAG)
Real estate workflows stretch across many systems: CRM, listing databases, document stores, and messaging channels with buyers, tenants, and investors. Traditional IT monitoring checks if servers are up and APIs respond, but AI behavior is probabilistic and heavily context-dependent. A system can be “up” and still deliver off-target or risky outputs that confuse a buyer or violate a policy. That is why AI monitoring focuses on the quality of answers, the evidence used, and the downstream actions taken, not just technical uptime. As outlined in AI observability monitoring best practices, teams need methods that detect subtle performance degradation and hallucinations, not only infrastructure blips.
In real estate, this has direct financial implications. Imagine a prospect asking an LLM assistant about HOA restrictions and receiving a generic but confident answer that misses a local clause; you could lose trust and face legal exposure. Or a renter chatbot that responds quickly but doesn’t resolve the actual question on pet policies, increasing handovers to live agents and elongating your sales cycle. Monitoring quality ensures you are not just answering, but answering the right thing with the right guardrails.
The pressure to get this right is higher as firms place AI at the front door for lead capture, routine contract checks, and investor communications. As several monitoring guides emphasize, quality oversight must be multi-layered, covering inputs, outputs, and user interactions, to link AI behavior with business KPIs such as conversion, time to response, and escalation rates. Without measuring the meaning and impact of responses, teams only see the plumbing, not the performance that customers feel.
Defining the metrics that matter: relevance, usefulness, and safety for LLM and RAG systems
You don’t need a dozen metrics to run a reliable program. You need three that everyone understands, with clear examples: relevance (did we answer the right question?), usefulness (did the response help the user complete the task?), and safety (did we avoid inappropriate, biased, or legally risky content?). For RAG pipelines serving real estate documents, also measure whether the answer used the expected source passages.
Relevance: Is the answer on target, given the user’s intent and your domain?
Relevance means the output directly addresses the query. In a property advisory chatbot, a relevant answer to “What is the parking situation for Unit 12B?” should reference the unit, the building’s parking policy, and availability, not generic city parking tips. For a tenant screening team, a relevant answer to “Does this lease allow subletting?” must cite the correct clause from the uploaded lease, not a general explanation of subletting rules.
Define relevance with concrete thresholds. A simple rubric can score 0-1 (irrelevant), 2-3 (partially relevant), 4-5 (highly relevant), then aggregate across sessions into a weekly score. Automated heuristics can supplement human ratings, such as checking if the answer includes entities from the prompt (unit number, building name) and a citation to the correct document chunk when using RAG. Monitoring guides recommend pairing automation with periodic human reviews to avoid blind spots and drift.
Explain it in business language too. Example of poor relevance: a buyer asks about property taxes for a given parcel, and the assistant discusses mortgage rates. Example of excellent relevance: the assistant quotes the current tax rate, links to the municipal data, and notes last year’s assessed value range.
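One way to supplement human rubric scores, as a rough sketch: check whether the answer mentions the entities from the question and, for RAG, carries a citation to the expected document. The field names (`answer`, `citations`, `expected_doc_id`) are assumptions about your logging format, not a fixed schema, and the entity extraction here is deliberately crude.

```python
import re

def heuristic_relevance(question: str, answer: str, citations: list[str],
                        expected_doc_id: str | None = None) -> float:
    """Crude 0-1 relevance signal: entity overlap plus a citation check for RAG answers."""
    # Very rough "entity" extraction: unit identifiers and standalone numbers.
    # A production system would use a proper NER step or your CRM's entity list.
    entities = set(re.findall(r"\bUnit\s+\w+\b|\b\d+[A-Z]?\b", question))
    mentioned = sum(1 for e in entities if e.lower() in answer.lower())
    entity_score = mentioned / len(entities) if entities else 1.0

    # For RAG, require the expected source among the cited chunks.
    citation_score = 1.0
    if expected_doc_id is not None:
        citation_score = 1.0 if any(expected_doc_id in c for c in citations) else 0.0

    return round(0.6 * entity_score + 0.4 * citation_score, 2)
```

A low heuristic score does not prove the answer is irrelevant; it is a cheap signal for routing a session into the human review queue.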
Usefulness: Did the user get what they need to act?
Usefulness is about outcomes. A correct but generic answer might not be useful if it lacks the steps or context the user needs to move forward. In a lead intake scenario, “We’ll contact you soon” is less useful than “Please choose a time here” with a calendar link tied to the listing’s agent and a confirmation of buyer preferences stored in the CRM.
Define task-based KPIs, such as “lead enrichment fields completed,” “documents correctly classified,” or “issue resolved without handover,” and use them as the anchor for usefulness. Many organizations track perceived helpfulness with a thumbs-up/down and a short reason. These feedback loops can feed dashboards and triggers for retraining or prompt changes. When usefulness metrics improve, you typically see better conversion, shorter cycles, and fewer escalations.
In real estate, usefulness might be the percentage of renter questions resolved by the assistant without agent involvement, or the share of lease extractions that pass a spot-check without corrections (e.g., lease start date, rent indexation clause, break options). Keep definitions consistent, and show examples of acceptable and unacceptable outputs so teams have a shared standard.
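A minimal sketch of how task-based usefulness could be aggregated from session logs. The `task_completed`, `handover`, and `thumbs` fields are assumptions about what your application records, not an existing schema.

```python
def usefulness_metrics(sessions: list[dict]) -> dict:
    """Aggregate task-based usefulness KPIs from session records.

    Each session dict is assumed to carry:
      task_completed (bool), handover (bool), thumbs (1, -1, or None).
    """
    total = len(sessions) or 1
    completed = sum(1 for s in sessions if s.get("task_completed"))
    no_handover = sum(1 for s in sessions if s.get("task_completed") and not s.get("handover"))
    rated = [s["thumbs"] for s in sessions if s.get("thumbs") is not None]
    return {
        "task_completion_rate": completed / total,
        "resolved_without_handover": no_handover / total,
        "thumbs_up_share": (sum(1 for t in rated if t > 0) / len(rated)) if rated else None,
        "sessions": total,
    }
```

Publishing these few numbers weekly, alongside the example transcripts behind them, keeps the definition of “useful” stable across teams.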
Safety: Are we avoiding risky content, bias, and policy violations?
Safety spans harmful content, discriminatory statements, privacy breaches, and legal misstatements. For public-facing real estate assistants, you must prevent answers that imply discriminatory steering, promise investment returns, mishandle PII, or provide unverified legal advice. Some monitoring platforms and guidance emphasize dedicated safety checks, specific rules, and visibility for compliance stakeholders.
Implement rule-based filters for disallowed topics, profanity, or sensitive attributes, complemented by model-based toxicity and bias detectors. For RAG outputs, require document citations for claims; when sources are missing or low confidence, reduce verbosity and offer to escalate. Present short “good vs. bad” examples in your playbook so non-technical readers understand the boundaries. Safety monitoring is not about adding friction; it is about earning trust and keeping operations within policy.
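For illustration, a simple rule-based pre-screen along these lines can run on every outbound answer before it reaches the user. The term lists and intent names below are placeholders your compliance team would own and version; they are not a complete policy.

```python
import re

# Placeholder patterns - the real lists are defined and versioned by compliance.
GUARANTEE_PATTERNS = [r"\bguaranteed return\b", r"\byou will (?:certainly|definitely) profit\b"]
STEERING_PATTERNS = [r"\bneighborhood for (?:families|young professionals) like you\b"]
CITATION_REQUIRED_INTENTS = {"lease_terms", "hoa_rules", "tax_question"}

def safety_check(answer: str, intent: str, citations: list[str]) -> list[str]:
    """Return a list of violation tags; an empty list means the answer may be sent."""
    violations = []
    text = answer.lower()
    if any(re.search(p, text) for p in GUARANTEE_PATTERNS):
        violations.append("unapproved_guarantee")
    if any(re.search(p, text) for p in STEERING_PATTERNS):
        violations.append("possible_steering")
    if intent in CITATION_REQUIRED_INTENTS and not citations:
        violations.append("missing_citation_for_sensitive_intent")
    return violations
```

Rule-based checks will miss subtle cases, which is why they sit alongside model-based detectors and the human sampling described below, not in place of them.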
Quality alerts, anomaly detection, human sampling, AI red teaming, and data minimization
Every metric needs a threshold and an alerting path. If your relevance or usefulness score falls below 90% for two business days, notify the product owner, halt any ongoing model promotions, and open a review task. If any safety violation appears, escalate immediately to compliance and quarantine the session for analysis. Several guides on AI agent monitoring advise a layered approach, combining real-time alerts, daily summaries, and weekly human audits, to prevent slow drifts from becoming production incidents.
Begin with a few red-flag rules that automatically raise a ticket and send a message to the on-call Slack channel (a wiring sketch follows this list):
- Use of restricted terms or content (e.g., demographic attributes) in property recommendations.
- Claims of guarantees about investment returns or legal outcomes without approved disclaimers.
- RAG answer without a source citation when responding to compliance-sensitive queries.
- Excessive hedging or low confidence across multiple sessions in a row for the same scenario.
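A sketch of how those red-flag rules could be wired to an alert, assuming you log sessions as dicts and use a Slack incoming webhook. The webhook URL, session fields, and rule definitions are illustrative assumptions; the ticketing hook is left as a placeholder.

```python
import json
import urllib.request

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # placeholder webhook

# Illustrative rules; each takes a session dict and returns True when it fires.
RULES = {
    "uncited_compliance_answer": lambda s: s.get("intent") in {"lease_terms", "hoa_rules"}
                                           and not s.get("citations"),
    "return_guarantee": lambda s: "guaranteed return" in s.get("answer", "").lower(),
}

def notify_slack(text: str) -> None:
    """Post a plain-text alert to the on-call channel via an incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    req = urllib.request.Request(SLACK_WEBHOOK_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req, timeout=10)

def evaluate_red_flags(session: dict, rules: dict = RULES) -> list[str]:
    """Run every rule against a session; alert on any hits and return the triggered rule names."""
    hits = [name for name, rule in rules.items() if rule(session)]
    if hits:
        notify_slack(f"[AI quality] session {session['id']} flagged: {', '.join(hits)}")
        # open_review_ticket(session, hits)  # hypothetical hook into your ticketing system
    return hits
```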
Red-flag rules can catch obvious violations, but systematic assurance requires sampling. A pragmatic benchmark is to review 1-5% of sessions weekly, with a higher sampling rate for new features or models. AI observability literature repeatedly recommends a human-in-the-loop review as a standard operating practice, especially for public-facing workloads where small missteps have outsized effects. Combine random samples with targeted samples (e.g., long conversations, edge case intents, or new property types).
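A sketch of how the weekly review sample could be drawn, mixing a random slice with targeted picks such as long conversations and low-confidence answers. The `turns`, `confidence`, and `id` fields are assumptions about your session logs.

```python
import random

def weekly_review_sample(sessions: list[dict], base_rate: float = 0.03,
                         seed: int | None = None) -> list[dict]:
    """Combine a random sample with targeted edge cases for human review."""
    rng = random.Random(seed)
    n_random = max(1, int(len(sessions) * base_rate)) if sessions else 0
    random_pick = rng.sample(sessions, min(n_random, len(sessions)))

    # Targeted picks: long conversations and low-confidence answers deserve extra eyes.
    targeted = [s for s in sessions
                if s.get("turns", 0) >= 8 or s.get("confidence", 1.0) < 0.5]

    # Deduplicate while preserving order (session ids assumed unique).
    seen, sample = set(), []
    for s in random_pick + targeted:
        if s["id"] not in seen:
            seen.add(s["id"])
            sample.append(s)
    return sample
```

Raise `base_rate` toward 0.05 for newly launched features or new property types, as suggested above, and drop it back once results stabilize.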
Use AI red teaming to stress-test your system before and after changes. In real estate, red team prompts might include questions designed to elicit steering, to leak private data from past chats, or to overstate returns. Document your red team scripts and add them to a regression suite that runs on every model or prompt update. Store results and trend them over time alongside production metrics. Treat red teaming as a routine test, not a one-time event; behaviors evolve with context, and continuous probing keeps you ahead of trouble.
Finally, reinforce data minimization. Many real estate interactions include personal data: emails, phone numbers, financial information. Process only what is necessary for the task, prune logs regularly, and segregate PII from training data. This reduces privacy risk and helps with audits. Data quality practitioners emphasize that clear definitions and governance of what data is captured and why are the foundation of trustworthy automation. Smaller, better-managed datasets are easier to secure and monitor.
Feedback loops and LLMOps in practice: Using user input to improve
A neat dashboard is not the outcome; better decisions and faster service are. To make monitoring produce improvements, tie user feedback directly into your development workflow. Ratings, comments, and escalations should feed into a triage queue. Product owners review patterns weekly: prompts that underperform, intents that need a specialized flow, or sources missing in the RAG index. This is standard operating guidance in mature LLMOps pipelines, where monitoring and feedback are formal steps of the lifecycle, not ad hoc activities.
Keep the loop simple. If users rate an answer as not useful, ask a follow-up question: “What was missing?” Even a few tags (too generic, wrong property, outdated info) can route issues to the right team: data engineering (indexing), prompt engineering, or compliance. Pair negative feedback with conversation transcripts and associated source snippets so reviewers can see context quickly. The more small, quick fixes you ship, the fewer big incidents you face later.
Operationally, this is LLMOps in practice. Treat prompts, guardrails, and data sources as versioned assets. Use pull requests for changes, run automated regression tests (including red team scripts and a small hold-out of real estate prompts), and require signoff before promoting to production. Resources like Microsoft’s MLOps workflow guidance illustrate how to formalize these gates and track lineage for auditability in enterprise environments. When the feedback loop is embedded in your release process, quality improves predictably.
Champion/challenger model version comparison for LLM and RAG assistants
Never switch models blind. The champion/challenger method runs a new model (or prompt) in shadow or partial traffic alongside the current champion, then compares performance before deciding on a full rollout. In real estate, your challenger could be a new RAG setup with improved property document embeddings or a fine-tuned LLM for lead qualification. You route, say, 10% of sessions to the challenger, track the same relevance, usefulness, and safety metrics, and promote only if it outperforms the champion.
This approach is well established in risk-aware model operations, with processes for versioning and shadow testing documented in model versioning practices. Calibrate your evaluation period. For high-volume tasks like FAQ routing, a few days may suffice. For lower-volume, high-stakes tasks such as investor relations Q&A, you may need a longer run to gather comparable evidence. Track segment-level performance too: a challenger might shine on rental questions but underperform on commercial lease clauses. Make the promotion decision based on demonstrated gains across your most valuable segments, not an overall average that hides weak spots.
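As a minimal sketch, the routing and comparison can be as simple as the functions below: a deterministic hash split so the same session always sees the same variant, and a per-segment average so weak segments are not hidden by the overall mean. The 10% share, segment labels, and metric names are assumptions to adapt.

```python
import hashlib

def route_variant(session_id: str, challenger_share: float = 0.10) -> str:
    """Deterministically assign a session to champion or challenger by hashing its id."""
    bucket = int(hashlib.sha256(session_id.encode()).hexdigest(), 16) % 100
    return "challenger" if bucket < challenger_share * 100 else "champion"

def segment_comparison(results: list[dict]) -> dict:
    """Average relevance per (segment, variant), so promotion decisions can be made segment by segment.

    Each result is assumed to look like:
      {"segment": "rental_faq", "variant": "champion", "relevance": 0.93}
    """
    grouped: dict[tuple[str, str], list[float]] = {}
    for r in results:
        grouped.setdefault((r["segment"], r["variant"]), []).append(r["relevance"])
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}
```

Hashing the session id (rather than picking randomly per request) keeps multi-turn conversations on a single variant, which makes the comparison cleaner.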
Rollback procedures and lock-release mechanisms that keep operations safe
Rollbacks are not a sign of failure; they are a sign of a mature operation. Your runbook should define a trigger, an action, and a verification step. A pragmatic trigger many teams adopt is a quality drop greater than 5% on relevance or usefulness sustained for 48 hours, or any safety incident above a severity threshold. When triggered, revert to the previous stable model or prompt set, disable the feature flag for the affected flow, and notify the business owner.
Plan for reversibility: version everything, make the “undo” step one click, and verify state after rollback with quick checks and dashboards. Tune alerts carefully too, since noise causes alert fatigue; Datadog monitor best practices emphasize clear owners, actionable thresholds, and deduplicated notifications. Include a lock-release policy: no promotion to production without a rollback plan, documented risks, attached champion/challenger results, and business signoff. For regulated or high-visibility flows, require a short “change advisory” note that states the exact versions, the test suite passed (including red team cases), and the rollback trigger. A reliable rollback is a form of insurance; you may not use it often, but when you need it, you need it fast.
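The trigger logic itself is small enough to express in code. A sketch, assuming you keep timestamped metric snapshots (timezone-aware, at whatever cadence you log) and that the `revert_*` and `notify_slack` calls are placeholders for your own deployment and alerting hooks:

```python
from datetime import datetime, timedelta, timezone

def should_rollback(snapshots: list[dict], metric: str, baseline: float,
                    drop_pct: float = 0.05, window_hours: int = 48) -> bool:
    """True if `metric` has stayed more than `drop_pct` below `baseline` for the whole window.

    Assumes snapshots cover the window at your normal logging cadence; an empty
    window returns False so silence alone never triggers a rollback.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(hours=window_hours)
    window = [s for s in snapshots if s["timestamp"] >= cutoff and metric in s]
    if not window:
        return False
    return all(s[metric] < baseline * (1 - drop_pct) for s in window)

# if should_rollback(snapshots, "relevance", baseline=0.93):
#     revert_model_version("lead-qualifier", to="previous-stable")  # placeholder deployment hook
#     notify_slack("[AI quality] automatic rollback triggered for lead-qualifier")
```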
A simple-to-advanced workflow: from scripts and spreadsheets to observability platforms
You can build an effective monitoring loop in days:
- Phase 1: Spreadsheets and scripts. Log sessions to a store, run a daily script that computes relevance and usefulness scores from samples, flag conversations with safety issues, and publish a simple dashboard. Track weekly trends and discuss them in a short meeting with product, operations, and compliance.
- Phase 2: Add automated alerts and dashboards. Integrate alerting via email or Slack when thresholds breach. Include a sampling job that automatically selects 1-5% of sessions for human review. Borrow a simple template for essential metrics, feedback loops, and escalation paths and adapt it to your domain.
- Phase 3: Adopt an observability platform and MLOps workflow. As volume grows, consider platforms that manage model versions, champion/challenger comparisons, traffic splitting, and explainability. Keep a record of versions, metrics, and deployments so audits are easy and experiments are organized.
Large-scale teams often rely on MLOps pipelines that automate testing, promotion, and rollback across dev, staging, and prod environments. Scale the tooling only when the process is working manually; otherwise, you will automate confusion.
Throughout these phases, keep your stakeholders in the loop. Commercial and support leaders should be able to read the dashboard and recognize what’s going well, what needs work, and what has changed since last week. That’s how AI in business stays aligned with real outcomes.
Real estate use cases: lead-qualification agent, AI PDF extraction, and RAG assistants
Real estate offers a perfect stage for AI to assist at scale, and a proving ground for quality monitoring.
Lead-qualification agent.
A lead-qualification agent that gathers buyer preferences, budget, and location can lift conversion and save time. Measure relevance by whether the agent asks questions aligned with the property category, and usefulness by the share of leads that move to a scheduled viewing without agent handoff. Safety checks should prevent the agent from discussing protected attributes or making promises about returns.
To reduce errors, use data minimization: only store contact info and preference fields needed by your CRM. Quality monitoring guides for AI agents stress mapping metrics to outcomes and maintaining visibility into user feedback loops.
AI PDF extraction for leases and property packs.
A common pain point is extracting dates, rent amounts, clauses, and parties from large lease documents or broker packs. Use a specialized extraction pipeline with OCR and validation rules. Relevance means the extracted fields correspond to the right document sections; usefulness means a human validator can approve without changes most of the time; safety includes privacy policies for storing PII and removal of unnecessary data.
Data quality literature underscores the need for clear standards and monitoring of schema changes, as a small format drift can ripple through your pipeline, a topic covered well in data quality monitoring playbooks. Add a weekly random sample of 1-5% of extractions for manual review, and raise alerts if error rates exceed your target for 48 hours.
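An illustrative validation pass for extracted lease fields might look like the sketch below; the field names, required list, and plausibility ranges are placeholders your legal or operations team would refine.

```python
from datetime import date

def validate_lease_extraction(fields: dict) -> list[str]:
    """Return a list of validation errors for one extracted lease record."""
    errors = []
    required = ["tenant_name", "lease_start", "lease_end", "monthly_rent"]
    errors += [f"missing field: {f}" for f in required if not fields.get(f)]

    start, end = fields.get("lease_start"), fields.get("lease_end")
    if isinstance(start, date) and isinstance(end, date) and end <= start:
        errors.append("lease_end is not after lease_start")

    rent = fields.get("monthly_rent")
    if isinstance(rent, (int, float)) and not (100 <= rent <= 1_000_000):
        errors.append("monthly_rent outside plausible range")  # catches OCR misreads like 4.50 vs 4500

    return errors
```

Records with any errors go to the human validator first; the weekly error rate per field is what feeds the 48-hour alert mentioned above.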
RAG assistant for brokers and asset managers.
RAG excels when the truth lives in your documents. For property disclosures, capex plans, or neighborhood stats, the assistant should cite sources from your data room or knowledge base. Measure relevance and correctness by checking that answers use the intended repository and proper document timestamps. Use versioned indexes so you can roll back a bad re-index that introduced stale or irrelevant documents. Software versioning and champion/challenger comparisons apply here as well as to models. Require a policy: no uncited answers in compliance-sensitive flows. Learn more about local LLM vs. RAG AI agents.
How to detect hallucinations in practice, without overengineering
Hallucinations show up when the model answers confidently but incorrectly or without source support. Combine three tactics: red-flag rules (“do not answer without citation in the following intents”), benchmark tests (a curated set of prompts with known answers the system must pass), and targeted sampling (review long or multi-turn conversations and those with low confidence scores). AI observability guidance consistently encourages a mix of automated checks and human review because hallucinations can be subtle and context-specific.
Teach your reviewers what to look for: contradictions, missing citations, misapplied policies, or invented figures. Keep a “hall of shame” for patterns you’ve seen so you can quickly spot and fix recurrences. Pair this with prompt and retrieval fixes, e.g., stricter system messages, top-k tuning for retrieval, and guardrails that require ground truth before answering. Most hallucinations are fixable with better retrieval and stricter answer rules; monitor to prove your fixes stick.
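A sketch of the benchmark-test tactic as runnable checks: curated prompts with known facts the answer must contain and sources it must cite. The test cases and the `ask_assistant` function are hypothetical; the latter stands in for however you call your system and is assumed to return an answer plus its citations.

```python
# Illustrative benchmark cases; the documents and expected values are placeholders.
BENCHMARK = [
    {
        "prompt": "Does the lease for Unit 4A allow subletting?",
        "must_contain": ["clause 11.2"],           # known ground truth for this test document
        "must_cite": "lease_unit_4a_2024.pdf",
    },
    {
        "prompt": "What is the current HOA fee for Maple Court?",
        "must_contain": ["320"],                   # expected figure from the indexed disclosure
        "must_cite": "maple_court_hoa_disclosure.pdf",
    },
]

def run_benchmark(ask_assistant) -> list[dict]:
    """ask_assistant(prompt) is assumed to return {'answer': str, 'citations': [str]}."""
    failures = []
    for case in BENCHMARK:
        result = ask_assistant(case["prompt"])
        answer, citations = result["answer"].lower(), result["citations"]
        missing = [m for m in case["must_contain"] if m.lower() not in answer]
        uncited = case["must_cite"] not in " ".join(citations)
        if missing or uncited:
            failures.append({"prompt": case["prompt"], "missing": missing, "uncited": uncited})
    return failures  # run before and after every model or prompt change; any failure blocks promotion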
Aligning metrics with business outcomes that matter to sales and operations
If a metric doesn’t connect to your sales process or support workflow, it won’t drive decisions. For a buyer Q&A assistant, set a target to reduce median response time while keeping relevance above your threshold. For tenant support, aim to decrease handoffs to humans with no compromise on safety. For investor relations, measure the share of inquiries resolved with cited sources and zero policy violations. Customer service quality programs provide precedents for outcome-oriented monitoring where quality checks inform coaching and process changes, not just scores. When leaders see how a 5-point rise in usefulness translates into faster deals, they rally behind the work.
Governance, audit, and data minimization without slowing delivery
Real estate firms often work under strict privacy, fair housing, and financial guidelines. Monitoring supports governance when it is auditable. Store versioned prompts, model IDs, configuration flags, and index digests, along with timestamps and owners. Keep a record of alert events and resolution steps. Versioning and lineage become invaluable when regulators or clients ask “what changed and why.”
Data minimization plays a role here too. Mask PII in logs where possible, set retention periods, and segment access by role. For document processing, restrict raw document access to pipelines and redact sensitive fields before storing outputs for analytics. Data quality governance best practices highlight how clear ownership and policies improve both reliability and compliance outcomes. Good governance is not bureaucracy; it is how you move fast with fewer surprises.
The human review loop: who does it, how often, and what they look for
A 1-5% weekly manual review rate is feasible for most teams and pays dividends in catching nuanced issues. Select reviewers from operations or subject-matter experts who understand the business context: leasing managers, property admins, or investor relations associates. Provide a short rubric: relevance, usefulness, safety, and whether the answer used the right source (for RAG). Capture quick comments and tags.
AI monitoring practitioners frequently note that regular human evaluations keep automation honest and reveal drift early. Stabilize the process with a calendar slot: a one-hour weekly “AI quality huddle” where reviewers share patterns, product owners decide fixes, and engineering plans updates. Small, consistent reviews prevent big, sporadic emergencies.
Tooling notes: where to automate and where to stay manual
High-performing teams automate data collection, scoring, and alerting, while keeping judgment and prioritization in human hands. Even large shops start with scripts and spreadsheets and layer in platforms as volume grows. Wire up CI/CD for prompts and models, implement champion/challenger traffic splitting, and capture artifacts for audit at scale. Engineers can lean on copilots for version tracking and comparisons, while less technical stakeholders rely on dashboards and alert streams rather than raw logs.
Do not forget observability for data pipelines. Changes in listing formats, municipal data feeds, or OCR output can subtly alter extraction accuracy. Use schema checks, anomaly detection on field distributions, and source freshness alerts to prevent silent failures. If data drifts, the model often gets blamed; watch the pipes, not just the brain.
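A sketch of a schema and null-rate check for a document pipeline batch; the expected field set and the tolerance are illustrative values, not recommendations.

```python
def check_extraction_batch(records: list[dict], expected_fields: set[str],
                           null_rate_tolerance: float = 0.10) -> list[str]:
    """Flag schema drift and suspicious null rates in a batch of extracted records."""
    if not records:
        return ["empty batch"]
    warnings = []

    # Schema check: fields that disappeared or appeared compared to the expected set.
    seen = set().union(*(r.keys() for r in records))
    missing = expected_fields - seen
    extra = seen - expected_fields
    if missing:
        warnings.append(f"fields missing from batch: {sorted(missing)}")
    if extra:
        warnings.append(f"unexpected new fields: {sorted(extra)}")

    # Distribution check: a jump in nulls often means a source format changed upstream.
    for field in expected_fields & seen:
        null_rate = sum(1 for r in records if r.get(field) in (None, "")) / len(records)
        if null_rate > null_rate_tolerance:
            warnings.append(f"{field}: {null_rate:.0%} empty values (tolerance {null_rate_tolerance:.0%})")
    return warnings
```

Running this on every ingest batch, before the model is blamed for bad answers, is usually the cheapest drift detector you can add.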
What makes AI quality monitoring different from traditional IT, and why that matters for your property business
In software, a function either returns the right value or throws an error. In language systems, many wrong answers look fluent, timely, and plausible. That is why uptime and API error rates are poor proxies for service quality in LLM systems. Monitoring has to “read” the output in the same way a human would and judge whether it meets business expectations. AI observability discussions emphasize semantic evaluation, business outcome tracking, and human sampling to catch what logs cannot.
For a real estate brand, this is not academic. A chatbot that never crashes but recommends the wrong neighborhoods or mishandles a sensitive inquiry damages trust quietly and steadily. Quality monitoring makes the invisible visible. It is the difference between believing everything is fine and knowing where to improve.
A checklist for rollbacks, without drama
When a rollback trigger fires, execution speed matters. Keep a practical checklist ready:
- Confirm the trigger: metric drop > threshold for the defined period, or a safety incident above severity level.
- Revert: switch traffic to the previous stable model or prompt; for RAG, revert to the previous index version.
- Verify: run a short regression suite, including red team prompts and benchmark questions, on the restored version.
- Communicate: notify stakeholders of the rollback and expected temporary behavior.
- Investigate: open an incident doc, capture context, and assign owners for root-cause analysis.
Guidance on deployment strategies urges teams to make rollbacks a first-class citizen of release planning, not an afterthought. Combine this with alerting practices that are actionable and owned, not just noisy. Your best rollback is one you can start and complete in minutes, not hours.
Frequently asked questions
Can this be automated?
Yes. Start with simple scripts and spreadsheets to compute relevance/usefulness, select samples, and send alerts. As traffic grows, adopt observability platforms and MLOps workflows that automate versioning, champion/challenger testing, and rollbacks.
How many cases should we evaluate manually?
Review 1-5% of sessions weekly. Use a higher rate for new features, new property types, or fresh data sources. This cadence helps you catch nuanced issues early.
How do we detect hallucinations?
Combine red-flag rules, targeted sampling of longer and low-confidence sessions, and periodic benchmark tests that the system must pass before and after releases. A mix of automation and human review is necessary to surface subtle hallucinations.
Will the business understand the metrics?
Yes, if you define them in plain language and show “good vs. bad” examples for your use cases. Training non-technical stakeholders with examples makes conversations grounded in outcomes.
When should we perform a rollback?
If relevance or usefulness drops by more than an agreed threshold (often 5%) for 48 hours, or if a safety incident above a predefined severity occurs, roll back and investigate. Design deployments so rollbacks are fast and auditable.
Common mistakes to avoid so your LLM, RAG, and agents don’t drift
Assuming traditional IT monitoring is enough. Uptime and latency metrics cannot tell you if the answer is correct, safe, or helpful. Quality issues are often invisible to infrastructure metrics.
Relying only on technical accuracy. In real estate workflows, usefulness and safety matter as much as raw accuracy. Tying metrics to task completion and policy compliance ensures the AI supports sales and compliance goals, not just model benchmarks.
Skipping regular reviews. Drift appears slowly, often via new property categories, updated lease templates, or emerging user intents. Without a human review loop, you spot trends late. Continuous monitoring and feedback beat one-time audits.
Promoting models without rollback plans. Even promising challengers can underperform in production. Version everything and make rollbacks part of the release plan.
Forgetting governance and data minimization. Store only what you need, mask PII, and keep an audit trail. Clear ownership and policies make monitoring simpler and compliance checks smoother.
Bringing it all together with iMakeable: real projects, real guardrails
At iMakeable, our consulting and software team builds practical AI in business solutions for real estate and adjacent industries: LLM assistants for buyer Q&A, RAG search across listings and due-diligence rooms, lead-qualification agents that sync with CRM, and AI PDF extraction for leases and broker packs. We define relevance/usefulness/safety in your language, set thresholds and alerting policies, and instrument sampling so leaders see what’s happening without reading raw logs. We wire feedback into your release process and set up champion/challenger paths so changes are data-driven.
For a property developer, we recently combined a RAG assistant with document citations and a weekly human review on 3% of sessions; the team saw a marked decrease in escalations while keeping compliance reviewers comfortable with source-based answers. For a broker network, a lead-qualification agent improved viewing scheduling while enforcing data minimization, storing only what the CRM needed.
We also help set up rollbacks, lock-release mechanisms, and documentation that satisfy internal audit and client demands. The outcome is a steady program where improvements ship often, issues are caught early, and everyone, from sales to compliance, can trust the AI is doing the right work.
If you want to discuss how to introduce quality metrics, alerts, human sampling, champion/challenger testing, and safe rollbacks into your LLM, RAG, or agent projects, or if you’re planning a lead-qualification agent or AI PDF extraction pipeline for leases, reach out to iMakeable for a free consultation. We’ll review your use case, suggest a step-by-step monitoring plan you can launch in weeks, and help you scale it with the right tooling when you’re ready.