Gemini Enterprise Photo-Driven Router Setup Assistant

AI/ML
About the Task
Built a Gemini Enterprise assistant that reads a router photo — model, ports, cabling, LEDs — and returns grounded fixes
Results
Setup time cut from 25–40 min to 10–15 min; self-install completion up from 60–70% to 80–90%
Support calls down from 180–250 to 80–120 per 1,000 installs; first-contact resolution up to 75–85%
Services used
Build Strategy
Build Product
Web Development

Overview

Setting up a new router is one of the few tasks customers are expected to handle entirely on their own — often while their internet is down and their patience is running out. They face a box of cables, a device covered in look-alike ports, and a row of blinking lights that mean nothing to most people. The usual help falls short: a printed manual assumes you know the terminology, and a chat window can't see the device in front of you. The customer is left describing a visual problem in words, and the agent is left guessing.

This use case describes a different approach, built on Gemini Enterprise: instead of asking customers to explain what they see, the assistant lets them show it. The result is a faster setup for the customer and a lighter support operation for the operator — and the sections that follow walk through how it works end to end.

Business Challenge

Router setup is fundamentally a visual task, but the tools built to support it are text-first — and that mismatch is the root cause of the friction.

When a customer runs into trouble, they have to translate what they see — port assignments, label markings, LED colors and blink patterns, which cable goes where — into words they often don't have. Most people don't know a WAN port from a LAN port, or that a slow blink means something different from a steady light. So the description that reaches a manual, a search box, or a chatbot is vague and often wrong, and the guidance that comes back is generic. Text-only support makes this worse because it is blind to device state: it can only respond to what the customer types, which is exactly the part they struggle to get right.

The cost shows up across the whole setup funnel:

  • Setup time: 25–40 minutes for a first-time install.
  • Unaided completion: only 60–70% of customers finish without contacting support.
  • Support load: 180–250 calls per 1,000 installs, many of them repeat contacts.
  • Truck rolls: a meaningful share of cases escalate to an on-site technician visit, the most expensive outcome, often for minor faults such as a WAN cable in the wrong port.

Each is a consequence of the same gap: the customer can see the problem but can't describe it, and the support channel can't see anything at all. Closing that gap means letting the assistant observe the device directly.

Solution

Photo in, guided fix out: that is the whole interaction model. Built on Gemini Enterprise, the assistant uses multimodal reasoning to interpret an ordinary phone photo the way a skilled technician would. From a single image it extracts the device model, port layout, label markings, cabling, and LED states. It then grounds that analysis in the operator's own knowledge (product manuals, SOPs, firmware notes, the model-specific LED matrix, and a known-issues database) and validates the device against the operator's registry. The output is not a generic article but step-by-step instructions tailored to the exact device and the problem the assistant can see.

The flow follows a consistent pattern:

  • Capture — the customer consents, then photographs the router (back panel for setup, front panel for connectivity issues).
  • Identify — Gemini recognizes the model, ports, cabling, and labels, with a confidence score.
  • Ground — the assistant retrieves operator-specific guides and validates against the device registry.
  • Guide — it generates clear, personalized setup or troubleshooting steps.
  • Escalate when needed — if confidence is low or the issue can't be resolved, it requests a clearer photo or hands a structured case to a human agent.

Two design choices make this dependable rather than just clever. First, the assistant checks its own confidence: a blurry or badly angled photo triggers a re-capture request instead of a wrong answer. Second, it never guesses beyond what it can support — low confidence becomes a clarifying question or a human handoff, not a fabricated fix.
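
The confidence gate described above can be sketched in a few lines. This is a minimal illustration, not the production logic: the threshold value, the retry limit, and the names `PhotoAnalysis` and `decide_next_step` are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class NextStep(Enum):
    GUIDE = "guide"          # confident enough to return instructions
    RECAPTURE = "recapture"  # ask the customer for a clearer photo
    HANDOFF = "handoff"      # pass a structured case to a human agent


@dataclass
class PhotoAnalysis:
    confidence: float        # 0.0 to 1.0, reported by the identification step
    recapture_attempts: int  # clearer photos already requested in this session


# Hypothetical values; a real deployment would tune these per device model.
GUIDE_THRESHOLD = 0.85
MAX_RECAPTURES = 2


def decide_next_step(analysis: PhotoAnalysis) -> NextStep:
    """Never guess: low confidence becomes a re-capture request or a handoff."""
    if analysis.confidence >= GUIDE_THRESHOLD:
        return NextStep.GUIDE
    if analysis.recapture_attempts < MAX_RECAPTURES:
        return NextStep.RECAPTURE
    return NextStep.HANDOFF
```

Capping the number of re-capture requests is one possible design choice: it keeps a customer with a poor camera or bad lighting from looping indefinitely before a human takes over.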

Figure: Executive architecture

User Journey

From the customer's side, it feels like a guided conversation, not a technical procedure — the assistant does the interpretation in the background.

  • Starting out. The customer opens the app, selects "Set up my router," and grants explicit consent for camera use and diagnostic processing.
  • Taking the photo. They photograph the router's back panel; the image uploads securely and Gemini identifies the model, port labels, cables, and markings.
  • Confidence check. If the photo is blurry, dark, or badly angled, the assistant asks for a clearer shot instead of guessing.
  • Personalized guidance. Once confident, it validates the device against the registry, pulls the operator's setup and activation flow for that exact model, and generates step-by-step instructions tailored to what it sees.
  • Connectivity issues. If the internet still isn't working, it requests a photo or short video of the front-panel lights, then reads the LED state and blink pattern against the model's LED matrix and firmware notes to pinpoint the cause.
  • Resolution or handoff. It either confirms success and closes the session, or opens a ticket with a structured diagnostic summary so a human inherits full context instead of starting from scratch.
  • Closing the loop. Every outcome is logged as anonymized metadata that feeds product and operational improvement.

The throughline is graceful escalation: at each stage the assistant advances the customer, asks for exactly what it needs, or hands off cleanly — it never fails silently.

Technology Architecture

This section describes the platform end to end — how a request enters, how the system is partitioned, how the agents reason, how the two main flows run, and how data is protected and deployed.

Request & Intake Path

Before any reasoning happens, every request passes through a layered intake path that handles where it came from, who the customer is, and how the image reaches the platform.

  • Channels. Requests enter from the telecom mobile app, a web support portal, messaging channels (WhatsApp, KakaoTalk, RCS), and an internal support-agent console — all feeding the same backend.
  • Edge and gateway. Traffic hits a hardened public edge (Cloud Load Balancing with a CDN, fronted by Cloud Armor as a WAF), then passes through Apigee / Cloud API Gateway, which enforces OAuth2 / OIDC and API-key checks.
  • Identity and consent capture. Authentication resolves against the operator's existing identity providers (Firebase Auth, telecom SSO, or customer IAM), and a dedicated Consent Capture Service records authorization as a discrete, auditable event.
  • Image upload. Images never flow through the general request path: a Cloud Run service issues a short-lived, single-use signed upload URL, and the device writes the photo directly to a dedicated Cloud Storage bucket for temporary router images.

Together, these layers turn a request from any channel into a clean, authenticated, consented session with an image in a known location.
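
To make the "short-lived, single-use" property of the upload step concrete, here is a standard-library sketch of a signed upload token. It is not how Cloud Storage signed URLs are built; it is a simplified stand-in that shows the same two checks (expiry and one-time use). The token layout, the in-memory nonce set, and the function names are assumptions for illustration, and session IDs are assumed not to contain the `|` separator.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

SECRET_KEY = secrets.token_bytes(32)  # hypothetical per-service signing key
_used_nonces = set()                  # in-memory stand-in for a shared store


def issue_upload_token(session_id: str, ttl_seconds: int = 300,
                       now: Optional[float] = None) -> str:
    """Return 'session|expiry|nonce|signature', valid for one upload."""
    now = time.time() if now is None else now
    expires_at = int(now) + ttl_seconds
    nonce = secrets.token_hex(8)
    payload = f"{session_id}|{expires_at}|{nonce}"
    signature = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{signature}"


def verify_upload_token(token: str, now: Optional[float] = None) -> bool:
    """Accept a token only if it is authentic, unexpired, and unused."""
    now = time.time() if now is None else now
    try:
        session_id, expires_at, nonce, signature = token.split("|")
    except ValueError:
        return False  # wrong number of fields
    payload = f"{session_id}|{expires_at}|{nonce}"
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False  # tampered or forged
    if now > int(expires_at):
        return False  # expired
    if nonce in _used_nonces:
        return False  # already used once
    _used_nonces.add(nonce)
    return True
```

The signature is checked before the expiry field is parsed, so a tampered token is rejected without its contents being trusted.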

Enterprise Architecture

The system is divided into distinct trust zones, each with a clear responsibility and governed boundaries between them. Nothing crosses from one zone to the next without passing through an explicit interface.

  • Customer zone — the device and channels; untrusted by default.
  • Public edge and API protection — the hardened perimeter; the only zone exposed to the public internet.
  • Application zone — stateless services that orchestrate sessions and business logic; holds no customer data at rest.
  • Enterprise and operator systems — CRM, device registry, telecom provisioning, warranty, ticketing/field service; reached only through controlled integrations.
  • AI and agent zone — Gemini Enterprise, Vertex AI models, and the custom diagnostic agents; isolated so model behavior can be governed independently.
  • Data and knowledge zone — temporary image storage, grounded knowledge stores, and the analytics warehouse.
  • Operations and security zone — IAM, network controls, audit logging, and monitoring.

The benefit is containment: a request from the untrusted customer zone can reach sensitive systems only through a chain of explicit boundaries, so a problem in one zone doesn't propagate to the others.

Figure: Enterprise architecture zones

Agent Orchestration

The reasoning layer is a pipeline of specialized agents, each with a narrow job, coordinated by Gemini Enterprise as orchestrator; Vertex AI Gemini serves inference and the Vertex AI Agent Engine hosts the custom diagnostic agents.

  • Entry gate. The Policy and Safety Agent screens every request for unsafe instructions and confirms data-handling permissions before any diagnosis runs.
  • Parallel analysis. Three specialists examine the photo simultaneously: the Visual Device Identification Agent (model, revision, serial/QR), the Cable Validation Agent (cabling faults), and the LED Diagnostic Agent (indicator states).
  • Fusion and gating. A Diagnostic Fusion Step combines the three results, followed by a confidence and risk check — low confidence routes back for a clearer photo or human review; high confidence proceeds.
  • Grounding and response. The Knowledge Retrieval Agent pulls the relevant manuals, SOPs, LED matrix entries, and known issues, and the Grounded Response Generator turns the diagnosis plus that evidence into customer-facing steps. A Final Safety and Compliance Check screens the output before the customer ever sees it.
  • Escalation path. When human help is warranted, the CRM Escalation Agent and Human Handover Agent assemble a structured summary and route it into ticketing, CRM, or field service.
  • Learning. The Analytics Agent records the outcome for downstream analysis.

The principle throughout is decomposition with control: many narrow agents, a fusion step to reconcile them, confidence gating, and safety checks guarding both the entrance and the exit.
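
The parallel-analysis and fusion steps can be illustrated with stubbed agents. In this sketch the three functions return canned findings in place of calls to hosted agents; the device name, the findings, the "weakest input" fusion rule, and the 0.80 gate are all made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Finding:
    agent: str
    summary: str
    confidence: float


# Hypothetical stand-ins: each would call a hosted agent in a real system.
def identify_device(image_ref: str) -> Finding:
    return Finding("device_id", "example model, revision B", 0.94)


def validate_cables(image_ref: str) -> Finding:
    return Finding("cable", "WAN cable plugged into LAN1", 0.88)


def diagnose_leds(image_ref: str) -> Finding:
    return Finding("led", "internet LED off", 0.71)


def analyze(image_ref: str) -> dict:
    agents = (identify_device, validate_cables, diagnose_leds)
    # Run the three specialists concurrently on the same image reference.
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        findings = list(pool.map(lambda agent: agent(image_ref), agents))
    # Fusion rule assumed here: the result is only as strong as its weakest input.
    overall = min(f.confidence for f in findings)
    return {
        "findings": {f.agent: f.summary for f in findings},
        "confidence": overall,
        "proceed": overall >= 0.80,  # hypothetical gate
    }
```

Taking the minimum is a deliberately conservative choice: one uncertain specialist is enough to send the session back for a clearer photo or to human review.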

Diagnostic Flows

Two flows cover the majority of traffic, each choreographed so every system is called at the right moment with the least data necessary.

First-time setup sequence

  • Authenticate the customer and capture consent, then create a session and return a session ID.
  • Issue a signed upload URL; the device uploads the image, raising an image-uploaded event that notifies the orchestrator.
  • The orchestrator sends the prompt, context, and image reference to Gemini, which returns a structured JSON result (model, ports, labels, cabling, confidence).
  • Validate the serial or model against the device registry.
  • Retrieve grounding content — manual, LED matrix, operator SOP.
  • Check line and device activation via telecom provisioning when needed.
  • Return a recommended action with a confidence score, and store anonymized metadata.
  • Branch to the outcome: confirm a successful setup, or open a ticket carrying the diagnostic summary; record the final outcome.
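
The structured JSON result in the third step is easiest to handle when it is validated before anything downstream uses it. The sketch below assumes the field names `model`, `ports`, `labels`, `cabling`, and `confidence`; the source lists these as the contents of the result, but the exact keys and types are assumptions.

```python
import json
from dataclasses import dataclass

REQUIRED_FIELDS = ("model", "ports", "labels", "cabling", "confidence")


@dataclass
class ImageResult:
    model: str
    ports: list
    labels: list
    cabling: dict
    confidence: float


def parse_image_result(raw: str) -> ImageResult:
    """Parse and validate the model's JSON output; raise ValueError if malformed."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return ImageResult(
        model=str(data["model"]),
        ports=list(data["ports"]),
        labels=list(data["labels"]),
        cabling=dict(data["cabling"]),
        confidence=confidence,
    )
```

Rejecting an incomplete result at this point means the registry lookup and grounding steps never run on partial data.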

Figure: Setup data flow

LED diagnostic flow

When a customer reports that the internet still isn't working, the system shifts to fault diagnosis driven by the front-panel lights.

  • Gemini analyzes the visible LEDs from the front-panel photo.
  • If a still image can't capture the state — for example, a blink — the assistant requests a five-second video and analyzes the pattern.
  • It classifies the LED status, retrieves the model-specific LED matrix, and compares against firmware notes and known issues.
  • The comparison resolves to a category with a specific action: power off/booting → safe power-on; WAN disconnected → reseat the WAN cable; authentication failed → check the telecom activation API; firmware update in progress → don't power off; Wi-Fi disabled → re-enable Wi-Fi; hardware fault suspected → escalate to replacement.
  • The outcome is stored.

Reading the blink pattern, not just the color, is what separates ambiguous states from definite ones — and the "do not power off during a firmware update" branch alone prevents one of the most common ways a customer bricks a new device.
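
The category-to-action mapping in the flow above is essentially a lookup table. A minimal sketch follows, with the six categories taken from the list and everything else assumed: the key names, the fallback to human escalation for an unknown category, and the `is_blinking` helper.

```python
# Categories and actions follow the list above; the key names are illustrative.
LED_ACTIONS = {
    "power_off_or_booting": "Guide a safe power-on.",
    "wan_disconnected": "Reseat the WAN cable.",
    "authentication_failed": "Check the telecom activation API.",
    "firmware_update_in_progress": "Do not power off.",
    "wifi_disabled": "Re-enable Wi-Fi.",
    "hardware_fault_suspected": "Escalate to replacement.",
}

# Assumed fallback: an unrecognized state goes to a human rather than a guess.
UNKNOWN_ACTION = "Hand off to a human agent."


def recommend_action(category: str) -> str:
    return LED_ACTIONS.get(category, UNKNOWN_ACTION)


def is_blinking(samples: list) -> bool:
    """True if the LED changed state across sampled video frames.

    A single still image yields one sample, so it can never show a blink.
    """
    return len(set(samples)) > 1
```

For example, `recommend_action("firmware_update_in_progress")` returns the "Do not power off." instruction, and `is_blinking([True])` is always `False`, which is why the flow asks for a short video when a blink is suspected.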

Figure: LED diagnostic flow

Security & Privacy Controls

A router photo is sensitive data — it can contain a serial number, a MAC address, a QR code, and an incidental view of the customer's home — so controls apply at every stage. Privacy here is enforced by the architecture, not left to policy.

  • On the way in. Every session is anchored by an explicit consent record, and all traffic travels under TLS.
  • Encryption (CMEK). Data at rest is encrypted with Cloud KMS using customer-managed keys.
  • Detection and redaction. Cloud DLP scans for PII, MAC addresses, and serial numbers, masking or redacting them before they propagate.
  • Access control. IAM and RBAC restrict every service and person to the minimum access required.
  • Network isolation. VPC Service Controls fence the sensitive zones even if credentials are compromised.
  • Model protection. Model Armor screens for prompt injection and unsafe content before and after generation.
  • Auditability. Audit logs and Access Transparency record who and what touched the data.
  • Automatic deletion. Temporary images are deleted within 24–72 hours by lifecycle policy.
  • Anonymized analytics and minimum-necessary handoff. Only anonymized metadata reaches analytics, and escalations carry only the data required to act, through a separate restricted support environment.

The principle is consistent: collect the minimum, protect it in depth while in use, and retire it on a schedule.
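
As a rough picture of the detection-and-redaction step, the sketch below masks MAC addresses and serial numbers in extracted label text using regular expressions. It is a simplified stand-in, not the Cloud DLP API, and the serial-number format ("SN" followed by 8 to 12 uppercase letters or digits) is invented for the example.

```python
import re

# Six hex pairs separated by ':' or '-', e.g. 00:1A:2B:3C:4D:5E.
MAC_PATTERN = re.compile(r"\b(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}\b")

# Hypothetical serial format used only for this illustration.
SERIAL_PATTERN = re.compile(r"\bSN[0-9A-Z]{8,12}\b")


def redact(text: str) -> str:
    """Replace MAC addresses and serial numbers with fixed placeholders."""
    text = MAC_PATTERN.sub("[MAC_REDACTED]", text)
    return SERIAL_PATTERN.sub("[SERIAL_REDACTED]", text)
```

Calling `redact("Label reads SN12345678AB, MAC 00:1A:2B:3C:4D:5E")` returns `"Label reads [SERIAL_REDACTED], MAC [MAC_REDACTED]"`. Running the redaction before text reaches logs, tickets, or analytics matches the "before they propagate" ordering described above.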

Figure: Security & privacy architecture

Data Model

The data model is organized around one central entity — the diagnostic session — with every other record linked back to it, giving the operator a complete, queryable record without retaining more than necessary.

DIAGNOSTIC_SESSION carries the identifying and outcome fields: a session_id key, a hashed customer_id, operator_id, country, language, consent_timestamp, and the fields that capture how the session ended (the human_escalation and image_deleted flags and the outcome). Six entities reference it via session_id:

  • DEVICE — model, hardware revision, firmware, masked serial, hashed MAC, batch.
  • IMAGE_ANALYSIS — confidence scores, detected ports and LEDs, cable status, image quality.
  • DIAGNOSIS — category, root cause, recommended action, resolved flag, confidence score.
  • KNOWLEDGE_REFERENCE — the IDs of the grounding sources used (manual, firmware note, LED matrix, SOP, known issue).
  • ESCALATION_TICKET — present only when a case is handed to a human.
  • ANALYTICS_EVENT — the anonymized signals that feed analytics.

The design is auditable (every diagnosis links to the knowledge that grounded it), privacy-preserving by default (IDs hashed, serials masked at the schema level), and analytics-ready (confidence, categories, and outcomes captured as structured fields).
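
Two of these entities can be sketched as Python dataclasses to show how hashing and masking might sit at the schema boundary. The field names follow the description above, but the types, the keyed-hash approach, and the "last four characters visible" masking rule are assumptions.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Optional


def hash_identifier(value: str, key: bytes) -> str:
    """Keyed SHA-256 hash, so raw IDs cannot be recovered by simple lookup."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()


def mask_serial(serial: str, visible: int = 4) -> str:
    """Keep only the last few characters of a serial number."""
    if len(serial) <= visible:
        return "*" * len(serial)
    return "*" * (len(serial) - visible) + serial[-visible:]


@dataclass
class DiagnosticSession:
    session_id: str
    customer_id: str        # stored as a keyed hash, never the raw ID
    operator_id: str
    country: str
    language: str
    consent_timestamp: str  # e.g. an ISO 8601 string
    human_escalation: bool = False
    image_deleted: bool = False
    outcome: Optional[str] = None


@dataclass
class Device:
    session_id: str         # reference back to DiagnosticSession
    model: str
    hardware_revision: str
    firmware: str
    serial: str             # masked before storage
    mac: str                # hashed before storage
    batch: str
```

With these helpers, `mask_serial("SN12345678")` yields `"******5678"`, and the same customer ID hashed with the same key always maps to the same value, which keeps sessions joinable without storing the raw identifier.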

Figure: Data model

Deployment Topology

The system runs in a single production region in Korea and is built almost entirely from managed, independently scalable Google Cloud services:

  • Cloud Run (stateless app tier): Session API, Image Intake API, Prompt Builder, Diagnostic Orchestrator, Case Integration API, Feedback Collector.
  • AI & agents: Vertex AI Gemini (inference) and Vertex AI Agent Engine (custom ADK agents); grounding from Gemini Enterprise data stores.
  • Messaging, state & secrets: Cloud Tasks, Pub/Sub, Secret Manager.
  • Data: BigQuery (anonymized metadata), Cloud Storage (temporary image bucket), Cloud KMS + Cloud DLP (encryption and sensitive-data scanning).
  • External systems: CRM, ServiceNow / Zendesk / Salesforce, Field Service, Device Registry, Telecom Provisioning, Warranty — via controlled integrations.
  • Restricted support env: Support Agent Console with its own RBAC and a Redaction Service.
  • Observability: Cloud Logging, Cloud Monitoring, Looker dashboards.

Results and Impact

Measured Impact

Four levers, shown as before → after against the original baselines, followed by one target and one qualitative benefit:

  • Setup time: 25–40 min → 10–15 min.
  • Self-install completion: 60–70% → 80–90% — fewer abandoned activations and returns.
  • Support calls / 1,000 installs: 180–250 → 80–120 — frees capacity for complex issues.
  • First-contact resolution: 55–65% → 75–85% — cases arrive pre-diagnosed with full context.
  • Cabling solved without an agent: target 70%+.
  • Product intelligence: surfaces confusing layouts, firmware failure patterns, and setup-driven returns — compounding over time.

Together, these gains make self-service the fastest, cheapest, and most reliable path, and one that grows more capable as the feedback loop matures.
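
Because each metric is reported as a range, the implied improvement is also a range. The helper below derives the smallest and largest relative reduction from two (low, high) pairs. The derived percentages are arithmetic on the figures above, not separately reported results, and pairing the extremes this way is an assumption about how to bound them.

```python
def reduction_range(before, after):
    """Smallest and largest percentage reduction implied by two (low, high) ranges."""
    before_low, before_high = before
    after_low, after_high = after
    smallest = 1 - after_high / before_low   # best baseline vs. worst result
    largest = 1 - after_low / before_high    # worst baseline vs. best result
    return round(smallest * 100), round(largest * 100)


setup_time = reduction_range((25, 40), (10, 15))         # minutes
support_calls = reduction_range((180, 250), (80, 120))   # per 1,000 installs
```

On the published ranges this gives a setup-time reduction of between 40% and 75%, and a support-call reduction of between 33% and 68%.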

Conclusion

The router is the example, but the pattern is the point. The same approach fits any product a customer must set up or troubleshoot on their own — appliances, medical devices, networking gear, smart-home and industrial hardware. Wherever there's a gap between what a person can see and what they can describe, a multimodal assistant that observes the device directly closes it.

Stripped to essentials, the reusable loop is:

  • Multimodal capture — the customer shows the device with a photo or short video instead of describing it.
  • Grounded diagnosis — interpret it against the manufacturer's own knowledge and validate against systems of record.
  • Confidence-gated guidance — return a fix only when sure enough; otherwise ask for clearer input or defer.
  • Structured escalation — hand a fully documented case to a human rather than starting from zero.

What makes it deployable is the discipline around it: consent before processing, sensitive data masked and encrypted under operator keys, inputs retired on schedule, minimum-necessary handoffs, and an anonymized analytics loop. The knowledge sources, diagnosis categories, and constraints change by industry, but the architecture doesn't — the same template re-points at new hardware without a redesign. The router assistant is one instance of a general capability: turning a customer's camera into a diagnostic instrument, and trusted enterprise knowledge into the expertise that reads it.
