    Guide to Node-level Caching in LangGraph

    October 29, 2025


    If you’re learning LangGraph or exploring it further, it’s worth knowing about its pre-built node-level caching. Caching not only eliminates unnecessary computation but also reduces latency. In this article we’ll look at how to implement it. We assume you already have an idea of agents and nodes in LangGraph, as we won’t be focusing on that side of the story, so without further ado let’s walk through the concepts and the implementation.

    What is Caching?

    Caching stores data in temporary storage so the system can retrieve it quickly. In the context of LLMs and AI agents, it saves earlier requests and reuses the stored responses when the same prompts are sent to the model or agent again. Because no new request is made, there is no additional charge and the response arrives much faster. Many providers also offer prompt (prefix) caching: when the beginning of a prompt matches a previous request, the already-processed part is reused and only the new portion is computed, which significantly reduces cost and latency even for requests that aren’t exact repeats.
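    To make the idea concrete, here is a minimal, illustrative sketch of response caching in plain Python. It is not LangGraph-specific, and call_llm is a hypothetical stand-in for a real (slow, billed) model call:

    cache: dict[str, str] = {}

    def call_llm(prompt: str) -> str:
        # Placeholder for an expensive, billed model call.
        return f"model answer for: {prompt}"

    def cached_call(prompt: str) -> str:
        if prompt in cache:              # cache hit: no new request, no extra cost
            return cache[prompt]
        response = call_llm(prompt)      # cache miss: one real call is made
        cache[prompt] = response         # store it for next time
        return response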

    Caching parameters and memory 

    It’s important to know about the ttl (time to live) parameter, which defines how long (in seconds) a cached entry remains in memory. If we set ttl=None, or leave the parameter at its default, cached entries never expire.
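    For example, a cache policy whose entries expire after two minutes looks like this (the 120-second value is just an illustration):

    from langgraph.types import CachePolicy

    short_lived = CachePolicy(ttl=120)    # entries expire 120 seconds after being written
    keep_forever = CachePolicy(ttl=None)  # entries stay until the cache is cleared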

    We also need to pass a cache when compiling the graph. In this article we’ll use InMemoryCache to store each node’s inputs and outputs so that a node’s previous response can be retrieved later. Alternatively, you can use a SQLite-backed cache, a Redis-backed cache, or a custom cache implementation, depending on your needs.
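    As a hedged sketch of a persistent alternative: recent LangGraph versions ship a SQLite-backed cache under langgraph.cache.sqlite. The constructor argument below (path) is an assumption, so check your version’s documentation for the exact signature:

    from langgraph.cache.sqlite import SqliteCache

    # `path` is assumed here; unlike InMemoryCache, the cache file survives
    # process restarts. `builder` is a StateGraph like the one built below.
    cache = SqliteCache(path="node_cache.db")
    graph = builder.compile(cache=cache)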

    Caching in Action 

    Let’s implement node-level caching for a function that converts Celsius to Fahrenheit.

    Step 1: Installations 

    !pip install langgraph 

    Step 2: Defining the Graph 

    We’ll first define the graph structure and a simple function that simulates a slow computation using time.sleep() to make the caching effect visible.

    import time

    from typing_extensions import TypedDict

    from langgraph.graph import StateGraph
    from langgraph.cache.memory import InMemoryCache
    from langgraph.types import CachePolicy

    # Graph state: the input temperature and the computed result
    class State(TypedDict):
        celsius: float
        fahrenheit: float

    builder = StateGraph(State)

    def convert_temperature(state: State) -> dict[str, float]:
        time.sleep(2)  # simulate a slow computation so the caching effect is visible
        fahrenheit = (state["celsius"] * 9 / 5) + 32
        return {"fahrenheit": fahrenheit}

    # Cache this node's results for 5 seconds
    builder.add_node("convert_temperature", convert_temperature, cache_policy=CachePolicy(ttl=5))
    builder.set_entry_point("convert_temperature")
    builder.set_finish_point("convert_temperature")

    cache = InMemoryCache()
    graph = builder.compile(cache=cache)

    Step 3: Invoking the Graph 

    Now, let’s invoke the graph multiple times and observe how the cache behaves.

    print(graph.invoke({"celsius": 25}))                         # first call: runs the node (~2 s)

    print(graph.invoke({"celsius": 25}, stream_mode="updates"))  # same input: served from the cache

    print(graph.invoke({"celsius": 36}, stream_mode="updates"))  # new input: runs the node

    time.sleep(10)  # wait longer than the 5-second TTL

    print(graph.invoke({"celsius": 36}, stream_mode="updates"))  # TTL expired: runs the node again

    cache.clear()  # clears the entire cache

    Output

    {'celsius': 25, 'fahrenheit': 77.0} 

    [{'convert_temperature': {'fahrenheit': 77.0}, '__metadata__': {'cached': True}}] 

    [{'convert_temperature': {'fahrenheit': 96.8}}] 

    [{'convert_temperature': {'fahrenheit': 96.8}}]

    With the TTL set to 5 seconds, the first repeated request (celsius=25) is served from the cache, which is why its update carries the metadata 'cached': True. The repeated request for celsius=36 is treated as a new one because the 10-second gap exceeds the TTL. Finally, cache.clear() removes every stored entry; this is especially useful when ttl=None, since those entries would otherwise never expire.
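    If you want to check programmatically whether a result came from the cache, you can look for the '__metadata__' entry that appears in the "updates" stream mode output, as in this small sketch (the was_cached helper is our own, not part of LangGraph):

    def was_cached(updates: list[dict]) -> bool:
        # Each update dict may carry a '__metadata__' entry with a 'cached' flag
        return any(u.get("__metadata__", {}).get("cached", False) for u in updates)

    result = graph.invoke({"celsius": 25}, stream_mode="updates")
    print(was_cached(result))  # True if this input was served from the cache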

    Now, let’s implement caching for a node that wraps an agent.

    Prerequisites: Caching for a Node with an Agent

    We’ll need a Gemini API key to use Gemini models in the agent. Visit Google AI Studio to get your key: https://aistudio.google.com/api-keys


    Installations

    The langchain_google_genai package lets us use Gemini models inside the node.

    !pip install langgraph langchain_google_genai 

    Agent definition 

    Let’s define a simple math agent that has access to a calculator tool; we’ll set ttl=None for now.

    import os

    from langgraph.prebuilt import create_react_agent
    from langchain_google_genai import ChatGoogleGenerativeAI

    def solve_math_problem(expression: str) -> str:
        """Solve a math problem."""
        try:
            # Evaluate the mathematical expression without access to builtins
            result = eval(expression, {"__builtins__": {}})
            return f"The answer is {result}."
        except Exception:
            return "I couldn't solve that expression."

    # Initialize the Gemini model with the API key from Google AI Studio
    GOOGLE_API_KEY = os.environ["GOOGLE_API_KEY"]
    model = ChatGoogleGenerativeAI(
        model="gemini-2.5-flash",
        google_api_key=GOOGLE_API_KEY,
    )

    # Create the ReAct-style agent with the calculator tool
    agent = create_react_agent(
        model=model,
        tools=[solve_math_problem],
        prompt=(
            "You are a Math Tutor AI. "
            "When a user asks a math question, reason through the steps clearly "
            "and use the tool `solve_math_problem` for numeric calculations. "
            "Always explain your reasoning before giving the final answer."
        ),
    )

    Defining the node 

    Next, we’ll wrap the agent inside a LangGraph node and attach caching to it.

    import time

    from typing_extensions import TypedDict

    from langgraph.graph import StateGraph
    from langgraph.cache.memory import InMemoryCache
    from langgraph.types import CachePolicy

    class AgentState(TypedDict):
        prompt: str
        response: str

    builder = StateGraph(AgentState)

    def run_agent(state: AgentState) -> AgentState:
        print("Running agent...")  # this line helps show caching behavior
        result = agent.invoke({"messages": [{"role": "user", "content": state["prompt"]}]})
        # Keep only the agent's final message so the state matches `response: str`
        return {"response": result["messages"][-1].content}

    builder.add_node("run_agent", run_agent, cache_policy=CachePolicy(ttl=None))
    builder.set_entry_point("run_agent")
    builder.set_finish_point("run_agent")

    graph = builder.compile(cache=InMemoryCache())

    Invoking the agent 

    Finally, let’s call the agent twice to see caching in action.

    # Invoke the graph twice to see caching
    print("First call")
    result1 = graph.invoke({"prompt": "What is (12 + 8) * 3?"}, stream_mode="updates")
    print(result1)

    print("Second call (should be cached)")
    result2 = graph.invoke({"prompt": "What is (12 + 8) * 3?"}, stream_mode="updates")
    print(result2)

    Output

    Notice that the second call doesn’t print ‘Running agent...’, the statement at the top of the node. The response came straight from the cache, so the agent (and its underlying LLM call) never ran again.
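    If you want to quantify the saving, a quick timing check around the two calls (using the graph defined above) makes the difference obvious:

    import time

    def timed_invoke(prompt: str):
        start = time.perf_counter()
        result = graph.invoke({"prompt": prompt}, stream_mode="updates")
        print(f"took {time.perf_counter() - start:.2f} s")
        return result

    timed_invoke("What is (12 + 8) * 3?")  # runs the agent: several seconds (LLM + tool call)
    timed_invoke("What is (12 + 8) * 3?")  # served from the cache: near-instant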

    Conclusion

    LangGraph’s built-in node-level caching provides a simple yet powerful way to reduce latency and computation by reusing previous results. With parameters like ttl to manage cache lifetime and backends such as InMemoryCache, a SQLite- or Redis-backed cache, or your own implementation, it offers flexibility for different use cases. Through the examples, from temperature conversion to an agent-based node, we saw how caching avoids redundant execution and saves cost. Overall, caching in LangGraph greatly improves efficiency, making workflows faster and cheaper to run.

    Frequently Asked Questions

    Q1. What is key_func in cache policy? 

    A. The key_func parameter defines how LangGraph generates a unique cache key for each node’s input. By default, it uses the node’s input values to create this key. You can override it to customize caching behavior, for example to ignore specific fields or to normalize inputs before comparison.
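    As a hedged sketch, reusing the temperature-conversion node from earlier (the exact signature and return type expected by your LangGraph version may differ, so treat this as an illustration): a custom key_func that ignores everything except the celsius field could look like this:

    import json
    from langgraph.types import CachePolicy

    def celsius_only_key(state):
        # Two inputs that share the same 'celsius' value hit the same cache entry,
        # regardless of any other fields in the state.
        return json.dumps({"celsius": state["celsius"]}, sort_keys=True)

    builder.add_node(
        "convert_temperature",
        convert_temperature,
        cache_policy=CachePolicy(ttl=60, key_func=celsius_only_key),
    )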

    Q2. How can I clear or refresh the cache? 

    A. You can manually clear the cache anytime using cache.clear(). This removes all stored node responses, forcing LangGraph to re-execute the nodes on the next call. It’s useful during debugging, when working with dynamic inputs, or when the cached data becomes outdated.

    Q3. Can I set different TTL values for different nodes? 

    A. Yes, each node can have its own CachePolicy with a custom ttl value. This allows you to cache heavy or slow computations longer while keeping frequently changing nodes fresh. Fine-tuning TTL values helps balance performance, accuracy, and memory efficiency in large graphs.
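    For illustration (ReportState, heavy_report, and live_prices are hypothetical names, not from the article), two nodes in the same graph can carry different policies:

    from typing_extensions import TypedDict
    from langgraph.graph import StateGraph
    from langgraph.types import CachePolicy

    class ReportState(TypedDict):
        report: str
        prices: str

    builder = StateGraph(ReportState)

    def heavy_report(state: ReportState) -> dict:
        return {"report": "monthly summary"}  # stands in for a slow, stable computation

    def live_prices(state: ReportState) -> dict:
        return {"prices": "latest quotes"}    # stands in for frequently changing data

    # Cache the slow node for an hour, the fast-changing node for only 30 seconds
    builder.add_node("heavy_report", heavy_report, cache_policy=CachePolicy(ttl=3600))
    builder.add_node("live_prices", live_prices, cache_policy=CachePolicy(ttl=30))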

    Mounish V

    Passionate about technology and innovation, a graduate of Vellore Institute of Technology. Currently working as a Data Science Trainee, focusing on Data Science. Deeply interested in Deep Learning and Generative AI, eager to explore cutting-edge techniques to solve complex problems and create impactful solutions.
