# Machine Learning

export const LwTemplate = ({title = "Key questions to get you started", icon = "sparkles", cta = "Powered by Agent Studio", linkHref = "https://lucidworks.com/demo/?utm_source=docs&utm_medium=referral&utm_campaign=docs_cta_ai"}) => {
  const [isLoaded, setIsLoaded] = useState(false);
  useEffect(() => {
    const timer = setTimeout(() => {
      setIsLoaded(true);
    }, 500);
    return () => clearTimeout(timer);
  }, []);
  return <div className="lw-template-container">
      <Card title={title} icon={icon}>
        {isLoaded && <span dangerouslySetInnerHTML={{
    __html: `<lw-template id="a029c1a9-28be-427e-b0e1-5d918920246a"></lw-template
            >`
  }} />}
        <Link href={linkHref} className="agent-studio-link text-left text-gray-600 gap-2 dark:text-gray-400 text-sm font-medium flex flex-row items-center hover:text-primary dark:hover:text-primary-light group-hover:text-primary group-hover:dark:text-primary-light">Powered by Lucidworks Agent Studio</Link>
      </Card>
    </div>;
};

See also these subtopics:

* [Machine Learning Models in Fusion](/docs/4/fusion-ai/concepts/machine-learning/machine-learning-models)
* [Machine Learning Jobs](/docs/4/fusion-ai/concepts/machine-learning/ml-jobs)

[Apache Spark](http://spark.apache.org/) is an open source cluster-computing framework that serves as a fast, general-purpose execution engine for large-scale data processing. It runs jobs that can be decomposed into stepwise tasks, which are distributed across a cluster of networked computers.
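As a rough illustration of that decomposition, the following sketch splits a word count into independent per-partition tasks and merges their partial results. It is plain Python, not Spark's API: the names `count_words`, `merge_counts`, and `run_job` and the sample input are invented for this example, and a local thread pool stands in for the worker nodes of a cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(partition):
    """Task: count the words in one partition of the input."""
    counts = Counter()
    for line in partition:
        counts.update(line.split())
    return counts


def merge_counts(partials):
    """Final step: combine the partial results from every task."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(lines, num_partitions=2):
    """Split the input into partitions, run one task per partition, then merge."""
    partitions = [lines[i::num_partitions] for i in range(num_partitions)]
    # Assumption: the thread pool is only a stand-in for cluster worker nodes.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(count_words, partitions))
    return merge_counts(partials)


lines = ["spark runs tasks", "tasks run on workers", "spark merges results"]
result = run_job(lines)
```

Because each task only reads its own partition, the tasks can run in any order or in parallel, and the merged result does not depend on how many partitions the input is split into.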

Spark improves on previous MapReduce implementations by using resilient distributed datasets (RDDs), a distributed memory abstraction that lets programmers perform in-memory computations on large clusters in a fault-tolerant manner.
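The fault tolerance of RDDs comes from lineage: each dataset remembers the transformations that produced it, so a lost in-memory partition can be recomputed from its source rather than restored from a replica. The toy class below sketches that idea in plain Python. `ToyRDD` and its methods are hypothetical names for this illustration and are not part of Spark's API.

```python
class ToyRDD:
    """A toy stand-in for an RDD: source partitions plus recorded lineage."""

    def __init__(self, source_partitions, lineage=()):
        self._source = source_partitions  # input data, one list per partition
        self._lineage = tuple(lineage)    # recorded transformations, applied lazily
        self._cache = {}                  # in-memory results keyed by partition index

    def map(self, fn):
        # Transformations are lazy: record the step and return a new dataset.
        return ToyRDD(self._source, self._lineage + (("map", fn),))

    def filter(self, predicate):
        return ToyRDD(self._source, self._lineage + (("filter", predicate),))

    def compute_partition(self, index):
        """Return one partition, rebuilding it from lineage if it is not cached."""
        if index not in self._cache:
            data = list(self._source[index])
            for kind, fn in self._lineage:
                if kind == "map":
                    data = [fn(x) for x in data]
                else:
                    data = [x for x in data if fn(x)]
            self._cache[index] = data
        return self._cache[index]

    def collect(self):
        """Gather every partition into a single list."""
        out = []
        for i in range(len(self._source)):
            out.extend(self.compute_partition(i))
        return out

    def lose_partition(self, index):
        """Simulate a node failure that drops one cached partition."""
        self._cache.pop(index, None)


squares_of_evens = (
    ToyRDD([[1, 2, 3], [4, 5, 6]])
    .filter(lambda x: x % 2 == 0)
    .map(lambda x: x * x)
)
first = squares_of_evens.collect()      # computes and caches both partitions
squares_of_evens.lose_partition(1)      # simulated failure: partition 1 leaves memory
recovered = squares_of_evens.collect()  # partition 1 is rebuilt from its lineage
```

Only the lost partition is recomputed in the second `collect()` call; partition 0 is still served from memory.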

Fusion manages a Spark cluster that is used for all [signal aggregation](/docs/4/fusion-ai/concepts/signals-and-aggregations/aggregations/overview) processes.

With a Fusion AI license, you can also use the Spark cluster to [train and compile machine learning models](/docs/4/fusion-ai/concepts/machine-learning/machine-learning-models), as well as to run experiments via the [Fusion UI](/docs/4/fusion-ai/concepts/experiments/overview) or the [Spark Jobs API](/docs/4/fusion-server/reference/api/spark-jobs-api).

See [Machine Learning Jobs](/docs/4/fusion-ai/concepts/machine-learning/ml-jobs) for details about each pre-defined machine learning job in Fusion.

To schedule and run jobs on the nodes in the cluster, Spark versions before 2.0 used [Akka](https://en.wikipedia.org/wiki/Akka_\(toolkit\)), a toolkit and runtime for building highly concurrent, distributed, and resilient message-driven applications on the JVM. Spark 2.0 and later no longer depend on Akka and use a built-in Netty-based RPC layer instead.

<LwTemplate />

## Further reading

* [Machine Learning in Lucidworks Fusion](https://lucidworks.com/post/machine-learning-in-lucidworks-fusion/)
* [Apache Spark Key Terms, Explained](https://databricks.com/blog/2016/06/22/apache-spark-key-terms-explained.html)
* [Apache Spark on Wikipedia](https://en.wikipedia.org/wiki/Apache_Spark)
