May 1, 2026 · 8 min read · AI/agents

Building a career copilot with Java 21 and LangChain4j

When I started thinking about how to navigate career decisions more deliberately, I realized the problem was not a lack of information — it was a lack of structured reasoning over my information. Job descriptions, skills gaps, community feedback, conference talks, open-source work: all of it exists in silos. Bumblebee is my attempt to wire it together using Java 21, LangChain4j, and the ReAct agent pattern.

Why Java for AI agents?

The Python ecosystem dominates AI tooling, and for good reason. But if your team already runs Spring Boot microservices, introducing a Python service for AI features adds operational complexity: separate deployments, another runtime, bridging HTTP calls. LangChain4j brings the same chain-of-thought and tool-calling primitives to the JVM natively, letting you build production-grade AI agents without leaving the ecosystem you already know.

Java 21’s virtual threads (Project Loom) are a bonus: LLM API calls are slow I/O, and virtual threads let you run hundreds of concurrent agent steps with a tiny thread-pool — no reactive programming required.

The ReAct pattern in plain terms

ReAct (Reasoning + Acting) is a loop:

  1. Reason — the model thinks out loud about what it knows and what it needs.
  2. Act — it calls a tool (a function you provide) to fetch missing information.
  3. Observe — it reads the tool’s result and folds it back into reasoning.
  4. Repeat until it has enough to give a final answer.

This is the foundation Bumblebee runs on. Every career question — “What skills should I develop next?”, “Is this job offer aligned with my trajectory?” — becomes a chain of tool calls over structured data about my own career.
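
To make the mechanics concrete, here is a deliberately simplified sketch of that loop in plain Java. Every type in it (LlmClient, LlmStep, ToolCall) is an invented stand-in rather than a LangChain4j class; LangChain4j runs an equivalent loop for you inside its AiService machinery, which is why the rest of this post never has to write it by hand.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: LlmClient, LlmStep and ToolCall are invented stand-ins,
// not LangChain4j types.
public class ReActLoopSketch {

    interface LlmClient {
        LlmStep nextStep(List<String> transcript); // asks the model what to do next
    }

    record ToolCall(String toolName, String argument) {}

    record LlmStep(String thought, ToolCall action, String finalAnswer) {
        boolean isFinal() { return finalAnswer != null; }
    }

    static String run(LlmClient llm, Map<String, Function<String, String>> tools, String question) {
        List<String> transcript = new ArrayList<>();
        transcript.add("Question: " + question);

        for (int i = 0; i < 10; i++) {                             // guard against endless loops
            LlmStep step = llm.nextStep(transcript);               // 1. Reason
            transcript.add("Thought: " + step.thought());
            if (step.isFinal()) {
                return step.finalAnswer();                         // done: return the answer
            }
            ToolCall call = step.action();                         // 2. Act
            String observation = tools.get(call.toolName()).apply(call.argument());
            transcript.add("Observation: " + observation);         // 3. Observe, then repeat
        }
        return "Gave up after too many steps.";
    }
}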

Project setup

Add LangChain4j to your pom.xml:

<dependency>
  <groupId>dev.langchain4j</groupId>
  <artifactId>langchain4j-spring-boot-starter</artifactId>
  <version>0.33.0</version>
</dependency>
<dependency>
  <groupId>dev.langchain4j</groupId>
  <artifactId>langchain4j-ollama-spring-boot-starter</artifactId>
  <version>0.33.0</version>
</dependency>

Configure the model in application.yml:

langchain4j:
  ollama:
    chat-model:
      model-name: llama3.2
      base-url: http://localhost:11434
      temperature: 0.3
      timeout: 120s

The low temperature (0.3) keeps the agent deterministic — you want consistent reasoning, not creative hallucinations, when analyzing career data.
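
With those two starters on the classpath, the configured model is exposed as a ChatLanguageModel bean. A throwaway smoke test (this CommandLineRunner is just a wiring check, not part of Bumblebee) confirms the Ollama connection before any agent code exists:

import dev.langchain4j.model.chat.ChatLanguageModel;
import org.springframework.boot.CommandLineRunner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ModelSmokeTest {

    // Prints one completion at startup to confirm the local Ollama wiring works.
    @Bean
    CommandLineRunner verifyModel(ChatLanguageModel model) {
        return args -> System.out.println(model.generate("Reply with the single word: ready"));
    }
}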

Defining the agent interface

LangChain4j’s @AiService annotation turns a plain Java interface into a wired agent. You annotate it with a system prompt; with the Spring Boot starter’s default automatic wiring, the @Tool beans defined in the next section are picked up for it without being declared here:

@AiService
@SystemMessage("""
    You are Bumblebee, a career copilot for a senior backend software engineer.
    You help analyze skills gaps, evaluate opportunities, and plan next steps.
    Use the tools available to retrieve career data before answering.
    Reason step by step. Be specific and cite data when possible.
    """)
public interface CareerCopilot {

    String chat(@UserMessage String userMessage);
}

Spring Boot auto-configures the bean at startup — no manual wiring needed.
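
Using the agent is plain constructor injection. A minimal endpoint (this controller is a hypothetical example rather than Bumblebee's actual API) could look like this:

import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class CopilotController {

    private final CareerCopilot copilot;

    CopilotController(CareerCopilot copilot) {
        this.copilot = copilot;
    }

    // Forwards the raw question to the agent; the ReAct loop and tool calls
    // all happen behind chat().
    @PostMapping("/ask")
    public String ask(@RequestBody String question) {
        return copilot.chat(question);
    }
}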

Building the tools

Tools are methods annotated with @Tool. The agent decides when to call them based on their description. Here are three core tools Bumblebee uses:

@Component
public class CareerTools {

    private final SkillRepository skillRepository;
    private final ExperienceRepository experienceRepository;
    private final MarketSignalService marketSignalService;

    public CareerTools(SkillRepository skillRepository,
                       ExperienceRepository experienceRepository,
                       MarketSignalService marketSignalService) {
        this.skillRepository = skillRepository;
        this.experienceRepository = experienceRepository;
        this.marketSignalService = marketSignalService;
    }

    @Tool("Returns the engineer's current skill inventory with proficiency levels")
    public List<Skill> getCurrentSkills() {
        return skillRepository.findAll();
    }

    @Tool("Returns work experience entries sorted by recency, including role, company, and key technologies")
    public List<Experience> getWorkHistory() {
        return experienceRepository.findAllByOrderByStartDateDesc();
    }

    @Tool("Searches current job market signals for a given technology or role title")
    public MarketSignal getMarketSignal(@P("technology or role title to research") String query) {
        return marketSignalService.fetchSignal(query);
    }
}

The @P annotation describes each parameter so the model knows what to pass. Descriptions are the most important part: the agent reads them to decide which tool fits the current reasoning step.
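
Whatever a tool returns is serialized to text and handed back to the model as the observation, so flat, self-describing types work best. As an illustration (a guess at a reasonable shape, not Bumblebee's actual entity), Skill could be as simple as:

import java.time.LocalDate;

// Hypothetical shape for illustration only; the real entity may differ.
// Flat, descriptive fields serialize into observations the model can read directly.
public record Skill(
        String name,            // e.g. "Spring Boot"
        String proficiency,     // e.g. "Expert", "Beginner", "None"
        int yearsOfExperience,
        LocalDate lastUsed
) {}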

Virtual threads and concurrent tool calls

When the agent needs multiple tools in a single reasoning cycle, you want them to run in parallel. With virtual threads, this is trivial:

@Configuration
public class AsyncConfig {

    @Bean
    public Executor taskExecutor() {
        return Executors.newVirtualThreadPerTaskExecutor();
    }
}

Each tool invocation runs on its own virtual thread. A call that fetches skills and a call that hits an external market API run concurrently without blocking any OS thread. Under load, this scales to thousands of parallel agent steps with minimal memory overhead — something that would require a reactive pipeline in older Java.
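
As a sketch of the same idea in application code (the service below is illustrative and sits outside the agent loop, which decides on tool calls itself), two independent market lookups can be farmed out to that executor and joined:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import org.springframework.stereotype.Service;

@Service
public class ParallelSignalsService {

    private final CareerTools tools;
    private final Executor taskExecutor;   // the virtual-thread executor defined above

    ParallelSignalsService(CareerTools tools, Executor taskExecutor) {
        this.tools = tools;
        this.taskExecutor = taskExecutor;
    }

    // Kicks off both lookups at once; each blocks on I/O on its own virtual thread.
    public MarketComparison compare(String first, String second) {
        var left = CompletableFuture.supplyAsync(() -> tools.getMarketSignal(first), taskExecutor);
        var right = CompletableFuture.supplyAsync(() -> tools.getMarketSignal(second), taskExecutor);
        return new MarketComparison(left.join(), right.join());
    }

    // Hypothetical result holder, not part of Bumblebee.
    public record MarketComparison(MarketSignal left, MarketSignal right) {}
}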

Putting it together: a sample session

Here is what a real Bumblebee session looks like internally. The user asks:

“Should I invest time in learning Quarkus or deepen my Spring Boot expertise?”

The agent’s internal ReAct trace:

Thought: I need to know the engineer's current Spring Boot and Quarkus experience before advising.
Action: getCurrentSkills()
Observation: [Spring Boot: Expert, Quarkus: Beginner, Micronaut: None]

Thought: I should check market demand for both frameworks.
Action: getMarketSignal("Quarkus Java backend")
Observation: {trend: "rising", jobPostings30d: 1240, yoy: "+34%"}

Action: getMarketSignal("Spring Boot Java backend")
Observation: {trend: "stable dominant", jobPostings30d: 18400, yoy: "+8%"}

Thought: Spring Boot dominates volume but Quarkus is the fastest-growing JVM framework.
Given existing Spring Boot expertise, deepening it has high ROI for current market.
Quarkus is worth a focused 2-month investment given the growth curve and GraalVM native story.

Final Answer: Given your expert-level Spring Boot background and Quarkus' 34% YoY growth...

The model never hallucinates skill levels or market data because it always fetches them from tools before reasoning over them. This is the key discipline of agentic systems: ground every claim in retrieved data.

Lessons learned

Put behavioral rules in the system message, not the user message. Rules like being specific, citing sources, and reasoning step by step belong in the system prompt; the user message should contain only the question itself.

Tool descriptions are load-bearing. A vague @Tool("gets data") leads to random tool calls. A precise @Tool("Returns the engineer's current skill inventory with proficiency levels and last-updated timestamps") gives the model enough signal to call it at the right time.

Temperature matters for agents. High temperature produces creative but unreliable reasoning chains. For decision-support agents, 0.2–0.4 is the sweet spot.

Virtual threads make async feel synchronous. You write blocking I/O code as you always did in Java. The runtime handles concurrency. This is a genuine ergonomic win for AI workloads where every step involves network calls.

What’s next

Bumblebee will soon be open source. The roadmap includes a structured output layer (typed responses returned from the AiService instead of raw strings), a memory store backed by pgvector for long-running career conversations, and a Vaadin UI so non-terminal users can interact with it.
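
For the structured output piece, the most direct route in LangChain4j is an AiService method that returns a typed object instead of a String. A rough sketch of what that could look like, with invented names:

import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.spring.AiService;
import java.util.List;

// Hypothetical sketch of the planned structured-output layer; all names are illustrative.
record SkillGapReport(List<String> missingSkills, List<String> skillsToDeepen, String rationale) {}

@AiService
public interface CareerAnalyst {

    // Declaring a typed return value makes LangChain4j request JSON matching the
    // record and deserialize it, instead of handing back a raw String.
    @SystemMessage("You analyze skills gaps for a senior backend engineer. Use the available tools.")
    SkillGapReport analyzeGaps(@UserMessage String goal);
}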

If you are building AI features on the JVM, LangChain4j is ready for production. The Java 21 + Spring Boot + LangChain4j stack is more capable than the ecosystem’s reputation suggests — and virtual threads make the concurrency story genuinely better than Python’s async model for I/O-heavy agent workloads.