meta_mcp.tools._search
======================

.. py:module:: meta_mcp.tools._search


Attributes
----------

.. autoapisummary::

   meta_mcp.tools._search._model_cache


Functions
---------

.. autoapisummary::

   meta_mcp.tools._search.get_tool_info
   meta_mcp.tools._search.list_servers
   meta_mcp.tools._search.list_server_tools
   meta_mcp.tools._search.general_search
   meta_mcp.tools._search._string_match_search
   meta_mcp.tools._search._create_search_output_model
   meta_mcp.tools._search._create_search_system_prompt
   meta_mcp.tools._search._llm_search
   meta_mcp.tools._search._prepare_semantic_candidates
   meta_mcp.tools._search._compute_cosine_similarity_ranking
   meta_mcp.tools._search._semantic_search
   meta_mcp.tools._search._semantic_search_direct
   meta_mcp.tools._search._semantic_search_http


Module Contents
---------------

.. py:function:: get_tool_info(server_name: Annotated[str, The name of the server that provides the tool], tool_name: Annotated[str, The name of the tool to get information about]) -> str
   :async:


   Returns the input schema of the given tool, which describes how to call it.


.. py:function:: list_servers(query: Annotated[str | None, Optional query to filter the servers by] = None) -> str
   :async:


   Lists all MCP servers, or only those matching the query, together with their descriptions. The servers offer a wide range of tools for biomedical analysis tasks.


.. py:function:: list_server_tools(server_name: Annotated[str, The name of the MCP server to list tools for], query: Annotated[str | None, Optional query to filter the tools by] = None) -> str
   :async:


   Returns all tools of the given MCP server, or only those matching the query.


.. py:function:: general_search(query: str, candidates: list[str], top_n: int = 5, mode: Literal['string_match', 'llm', 'semantic'] = 'string_match', string_match_method: Literal['fuzzy', 'substring'] = 'fuzzy', descriptions: list[str | None] | None = None, reasoning: bool = True, **kwargs) -> list[str]

   Returns the top-n best-matching strings from the candidates list.

   :param query: The search query string.
   :type query: str
   :param candidates: List of candidate strings to search through.
   :type candidates: list[str]
   :param top_n: Number of top results to return (default: 5).
   :type top_n: int
   :param mode: Search mode: "string_match", "llm", or "semantic" (default: "string_match").
   :type mode: str
   :param string_match_method: String matching method: "fuzzy" or "substring" (default: "fuzzy").
   :type string_match_method: str
   :param descriptions: Optional list of descriptions for candidates. Must match length of candidates.
                        Used in both "llm" and "semantic" search modes. In "llm" mode, descriptions
                        are formatted as CSV in the system prompt. In "semantic" mode, descriptions are
                        combined with candidates as "candidate (description)" for embedding computation.
   :type descriptions: list[str | None] | None
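   :param reasoning: Whether to include reasoning in the LLM output and prompt (default: True).
                     Only used in "llm" mode.
   :type reasoning: bool
   :param \*\*kwargs: Mode-specific keyword arguments. "llm" mode keys: model (str), temperature (float).
                      "semantic" mode keys: backend (str), model (str), http_url (str).
   :type \*\*kwargs: dict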

   :returns: List of top-n matching strings, ordered by relevance (best match first).
   :rtype: list[str]

   :raises ValueError: If an invalid mode or string_match_method is provided, or if the length of
       descriptions does not match the length of candidates.
   :raises RuntimeError: If LLM search fails (when mode is "llm").


.. py:function:: _string_match_search(query: str, candidates: list[str], top_n: int, method: Literal['fuzzy', 'substring']) -> list[str]

   Perform string matching search using specified method.

   :param query: The search query string.
   :type query: str
   :param candidates: List of candidate strings to search through.
   :type candidates: list[str]
   :param top_n: Number of top results to return.
   :type top_n: int
   :param method: Matching method: "fuzzy" or "substring".
   :type method: str

   :returns: List of top-n matching strings, ordered by relevance.
   :rtype: list[str]
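
   The two matching methods can be illustrated with a minimal sketch. It is not the module's implementation: the fuzzy scorer here is ``difflib.SequenceMatcher`` from the standard library, used as a stand-in for whatever matcher the module relies on, and ``string_match_sketch`` is a hypothetical name.

   .. code-block:: python

      from difflib import SequenceMatcher


      def string_match_sketch(query, candidates, top_n, method="fuzzy"):
          """Illustrative stand-in for fuzzy and substring matching."""
          needle = query.lower()
          if method == "substring":
              # Keep candidates that contain the query, in their original order.
              hits = [c for c in candidates if needle in c.lower()]
              return hits[:top_n]
          if method == "fuzzy":
              # Rank all candidates by similarity ratio, best first.
              ranked = sorted(
                  candidates,
                  key=lambda c: SequenceMatcher(None, needle, c.lower()).ratio(),
                  reverse=True,
              )
              return ranked[:top_n]
          raise ValueError(f"Unknown method: {method!r}")


      names = ["list_servers", "list_server_tools", "get_tool_info"]
      substring_hits = string_match_sketch("tool", names, top_n=5, method="substring")
      fuzzy_hits = string_match_sketch("list_server", names, top_n=1, method="fuzzy")

   Substring matching only returns candidates that contain the query, so it may yield fewer than ``top_n`` results, whereas fuzzy matching always ranks every candidate.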


.. py:function:: _create_search_output_model(candidates: list[str], reasoning: bool = True) -> type[pydantic.BaseModel]

   Create a dynamic Pydantic model for LLM search output.

   :param candidates: List of candidate strings to create Literal types from.
   :type candidates: list[str]
   :param reasoning: Whether to include a reasoning field in the output model (default: True).
   :type reasoning: bool

   :returns: Dynamically created Pydantic model with selected_strings field and optionally reasoning field.
   :rtype: type[pydantic.BaseModel]
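
   One way to build such a model is sketched below, assuming ``pydantic.create_model`` and a ``Literal`` type built from the candidates. The names ``build_output_model`` and ``SearchOutput`` and the ``str`` type of the reasoning field are assumptions made for illustration, not the module's actual code.

   .. code-block:: python

      from typing import Literal

      from pydantic import create_model


      def build_output_model(candidates, reasoning=True):
          """Create a model whose selected_strings must come from candidates."""
          # Literal[("a", "b")] is equivalent to Literal["a", "b"].
          allowed = Literal[tuple(candidates)]
          fields = {"selected_strings": (list[allowed], ...)}
          if reasoning:
              # Assumed field type; required when reasoning is enabled.
              fields["reasoning"] = (str, ...)
          return create_model("SearchOutput", **fields)


      SearchOutput = build_output_model(["list_servers", "get_tool_info"])
      result = SearchOutput(
          selected_strings=["get_tool_info"], reasoning="Closest match."
      )

   Restricting the field to a ``Literal`` means a string that is not among the candidates fails validation instead of being returned to the caller.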


.. py:function:: _create_search_system_prompt(candidates: list[str], top_n: int, descriptions: list[str | None] | None = None, reasoning: bool = True) -> str

   Create a system prompt template for LLM search.

   :param candidates: List of candidate strings to search through.
   :type candidates: list[str]
   :param top_n: Number of top results to return.
   :type top_n: int
   :param descriptions: Optional list of descriptions for candidates. If provided and not all None,
                        formats candidates and descriptions as CSV table. Otherwise uses numbered list format.
   :type descriptions: list[str | None] | None
   :param reasoning: Whether to include reasoning instructions in the prompt (default: True).
   :type reasoning: bool

   :returns: System prompt string with candidates list and instructions.
   :rtype: str
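
   The choice between the CSV table and the numbered list can be sketched as follows. This covers only the candidate-formatting step, not the full prompt. The helper name ``format_candidates`` and the CSV column headers are assumptions made for illustration.

   .. code-block:: python

      import csv
      import io


      def format_candidates(candidates, descriptions=None):
          """Format candidates as CSV when descriptions exist, else as a list."""
          has_descriptions = descriptions is not None and any(
              d is not None for d in descriptions
          )
          if not has_descriptions:
              return "\n".join(f"{i}. {c}" for i, c in enumerate(candidates, start=1))
          buffer = io.StringIO()
          writer = csv.writer(buffer, lineterminator="\n")
          writer.writerow(["candidate", "description"])  # assumed header names
          for candidate, description in zip(candidates, descriptions):
              # Missing descriptions become empty cells.
              writer.writerow([candidate, description or ""])
          return buffer.getvalue().rstrip("\n")


      numbered = format_candidates(["a", "b"])
      table = format_candidates(["a", "b"], ["x, y", None])

   Using the ``csv`` module rather than manual string joining keeps descriptions that contain commas or quotes correctly escaped.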


.. py:function:: _llm_search(query: str, candidates: list[str], top_n: int, model: str = 'openai/gpt-5-nano', temperature: float = 1.0, descriptions: list[str | None] | None = None, reasoning: bool = True, verbose: bool = False, **kwargs) -> list[str]

   Perform LLM-based search using structured outputs.

   :param query: The search query string.
   :type query: str
   :param candidates: List of candidate strings to search through.
   :type candidates: list[str]
   :param top_n: Number of top results to return.
   :type top_n: int
   :param model: Model name for LLM backend (default: "openai/gpt-5-nano").
   :type model: str
   :param temperature: Sampling temperature (default: 1.0).
   :type temperature: float
   :param descriptions: Optional list of descriptions for candidates. When provided, candidates and
                        descriptions are formatted as CSV in the system prompt to help the LLM
                        make better matches. The function still returns only the original candidates.
   :type descriptions: list[str | None] | None
   :param reasoning: Whether to include reasoning in LLM output and prompt (default: True).
   :type reasoning: bool
   :param \*\*kwargs: Additional keyword arguments (unused, reserved for future use).

   :returns: List of top-n matching strings, ordered by relevance (original candidates without descriptions).
   :rtype: list[str]

   :raises ValueError: If the response contains invalid strings or parsing fails.
   :raises RuntimeError: If the LLM call fails.


.. py:data:: _model_cache
   :type: dict[str, object]

.. py:function:: _prepare_semantic_candidates(candidates: list[str], descriptions: list[str | None] | None) -> tuple[list[str], list[str]]

   Prepare candidates for semantic search by combining with descriptions.

   :param candidates: List of candidate strings.
   :type candidates: list[str]
   :param descriptions: Optional list of descriptions. If None or all None, returns original candidates.
                        If provided, combines as "candidate (description)" when description exists.
   :type descriptions: list[str | None] | None

   :returns: Tuple of (combined_strings_for_embedding, original_candidates).
             The combined strings are used for embedding, and original_candidates
             maintains the mapping back to original strings.
   :rtype: tuple[list[str], list[str]]
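
   The combination rule described above can be sketched as follows. The function name ``prepare_candidates`` is hypothetical, and treating an empty description like a missing one is an assumption.

   .. code-block:: python

      def prepare_candidates(candidates, descriptions):
          """Return (strings_to_embed, original_candidates)."""
          if descriptions is None or all(d is None for d in descriptions):
              # Nothing to combine: embed the candidates as they are.
              return list(candidates), list(candidates)
          combined = [
              f"{candidate} ({description})" if description else candidate
              for candidate, description in zip(candidates, descriptions)
          ]
          return combined, list(candidates)


      combined, originals = prepare_candidates(
          ["list_servers", "get_tool_info"],
          ["Lists MCP servers", None],
      )

   Both lists share the same order, so an index ranked on the combined strings maps directly back to the original candidate.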


.. py:function:: _compute_cosine_similarity_ranking(query_embedding: numpy.ndarray, candidate_embeddings: numpy.ndarray, top_n: int) -> numpy.ndarray

   Compute cosine similarity between query and candidates, return top-n indices.

   :param query_embedding: Query embedding vector.
   :type query_embedding: np.ndarray
   :param candidate_embeddings: Candidate embedding vectors (2D array).
   :type candidate_embeddings: np.ndarray
   :param top_n: Number of top results to return.
   :type top_n: int

   :returns: Indices of top-n candidates ordered by similarity (best first).
   :rtype: np.ndarray
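
   A minimal NumPy sketch of this ranking is shown below. It assumes all embeddings are non-zero vectors, and ``cosine_top_n`` is a hypothetical name rather than the module's function.

   .. code-block:: python

      import numpy as np


      def cosine_top_n(query_embedding, candidate_embeddings, top_n):
          """Return indices of the top_n most similar candidates, best first."""
          # Normalise to unit length so the dot product equals cosine similarity.
          query_unit = query_embedding / np.linalg.norm(query_embedding)
          candidate_units = candidate_embeddings / np.linalg.norm(
              candidate_embeddings, axis=1, keepdims=True
          )
          similarities = candidate_units @ query_unit
          # Sort in descending order of similarity and keep the first top_n.
          return np.argsort(-similarities)[:top_n]


      query = np.array([1.0, 0.0])
      candidates = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
      top_indices = cosine_top_n(query, candidates, top_n=2)

   Here the second candidate is parallel to the query and the third lies at 45 degrees to it, so they rank first and second. The first candidate is orthogonal to the query and is dropped.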


.. py:function:: _semantic_search(query: str, candidates: list[str], top_n: int, backend: Literal['direct', 'http'] = 'direct', model: str = 'all-MiniLM-L6-v2', http_url: str | None = None, descriptions: list[str | None] | None = None, **kwargs) -> list[str]

   Perform semantic search using embeddings.

   :param query: The search query string.
   :type query: str
   :param candidates: List of candidate strings to search through.
   :type candidates: list[str]
   :param top_n: Number of top results to return.
   :type top_n: int
   :param backend: Backend to use: "direct" or "http" (default: "direct").
   :type backend: str
   :param model: Model name for direct backend (default: "all-MiniLM-L6-v2").
   :type model: str
   :param http_url: URL for the HTTP backend. When None (the signature default), the URL is taken from
                    the META_MCP_EMBEDDING_HTTP_URL environment variable; "http://127.0.0.1:8501/embed"
                    is the documented default endpoint.
   :type http_url: str | None
   :param descriptions: Optional list of descriptions for candidates. Combined with candidates
                        as "candidate (description)" for embedding computation.
   :type descriptions: list[str | None] | None
   :param \*\*kwargs: Additional keyword arguments (unused, reserved for future use).

   :returns: List of top-n matching strings, ordered by relevance (original candidates without descriptions).
   :rtype: list[str]

   :raises ValueError: If invalid backend is provided.
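
   The URL fallback for the HTTP backend might be resolved as in the sketch below. The precedence shown (explicit argument, then environment variable, then built-in default) is an assumption based on the parameter description, and ``resolve_http_url`` is a hypothetical helper.

   .. code-block:: python

      import os

      DEFAULT_EMBED_URL = "http://127.0.0.1:8501/embed"


      def resolve_http_url(http_url=None):
          """Pick the embedding endpoint: argument, then env var, then default."""
          if http_url:
              return http_url
          # Assumed precedence: the environment variable overrides the default.
          return os.environ.get("META_MCP_EMBEDDING_HTTP_URL", DEFAULT_EMBED_URL)


      explicit_url = resolve_http_url("http://localhost:9000/embed")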


.. py:function:: _semantic_search_direct(query: str, combined_strings: list[str], original_candidates: list[str], top_n: int, model: str) -> list[str]

   Perform semantic search using direct model initialization.

   :param query: The search query string.
   :type query: str
   :param combined_strings: List of combined candidate strings (with descriptions if provided) for embedding.
   :type combined_strings: list[str]
   :param original_candidates: List of original candidate strings to return.
   :type original_candidates: list[str]
   :param top_n: Number of top results to return.
   :type top_n: int
   :param model: Model name to use for embeddings.
   :type model: str

   :returns: List of top-n matching strings, ordered by relevance (original candidates).
   :rtype: list[str]


.. py:function:: _semantic_search_http(query: str, combined_strings: list[str], original_candidates: list[str], top_n: int, http_url: str) -> list[str]

   Perform semantic search using HTTP embedding server.

   :param query: The search query string.
   :type query: str
   :param combined_strings: List of combined candidate strings (with descriptions if provided) for embedding.
   :type combined_strings: list[str]
   :param original_candidates: List of original candidate strings to return.
   :type original_candidates: list[str]
   :param top_n: Number of top results to return.
   :type top_n: int
   :param http_url: URL of the embedding server endpoint.
   :type http_url: str

   :returns: List of top-n matching strings, ordered by relevance (original candidates).
   :rtype: list[str]

   :raises RuntimeError: If the HTTP request fails.
   :raises ValueError: If the server response is invalid.


