.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/add.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_add.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_add.py:


Element-wise Addition Example
=============================

This example demonstrates how to implement an element-wise addition kernel using Helion.

.. GENERATED FROM PYTHON SOURCE LINES 9-11

Imports
-------

.. GENERATED FROM PYTHON SOURCE LINES 11-20

.. code-block:: Python

    from __future__ import annotations

    import torch

    import helion
    from helion._testing import run_example
    import helion.language as hl


.. GENERATED FROM PYTHON SOURCE LINES 21-23

Addition Kernel
---------------

.. GENERATED FROM PYTHON SOURCE LINES 23-49

.. code-block:: Python

    @helion.kernel()
    def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        """
        Add two tensors element-wise with broadcasting support.

        Args:
            x: First input tensor
            y: Second input tensor

        Returns:
            A new tensor containing the element-wise sum of x and y
        """
        # match pytorch broadcasting rules
        x, y = torch.broadcast_tensors(x, y)
        out = torch.empty(
            x.shape,
            # match type promotion of torch.add
            dtype=torch.promote_types(x.dtype, y.dtype),
            device=x.device,
        )
        # tile will be a tuple of blocks
        for tile in hl.tile(out.size()):
            out[tile] = x[tile] + y[tile]
        return out
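
The three steps in the kernel body (broadcast the inputs, allocate an output with the promoted dtype, then add tile by tile) can be mimicked in plain PyTorch. The sketch below is only an illustration: ``add_reference`` and its fixed ``block`` size are hypothetical names introduced here, it handles 2D outputs only, and it says nothing about how Helion actually chooses or schedules tiles.

.. code-block:: Python

    import torch


    def add_reference(x: torch.Tensor, y: torch.Tensor, block: int = 4) -> torch.Tensor:
        """Plain-PyTorch emulation of the kernel's steps (hypothetical helper, 2D only)."""
        # Same setup as the kernel: broadcast inputs, allocate a promoted-dtype output.
        x, y = torch.broadcast_tensors(x, y)
        out = torch.empty(
            x.shape,
            dtype=torch.promote_types(x.dtype, y.dtype),
            device=x.device,
        )
        # Hand-written tiling with a fixed block size. This is an assumption made
        # for illustration; it is not how Helion picks block sizes.
        rows, cols = out.shape
        for i in range(0, rows, block):
            for j in range(0, cols, block):
                # Each tile is a tuple of slices, one per dimension.
                tile = (slice(i, i + block), slice(j, j + block))
                out[tile] = x[tile] + y[tile]
        return out


    # A (6, 1) float16 tensor plus a (5,) float32 tensor broadcasts to (6, 5).
    x = torch.ones(6, 1, dtype=torch.float16)
    y = torch.arange(5, dtype=torch.float32)
    result = add_reference(x, y)
    print(result.shape, result.dtype)

Slices that run past the edge of the tensor are clamped by PyTorch indexing, so the last tile in each dimension may be smaller than ``block``.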


.. GENERATED FROM PYTHON SOURCE LINES 50-52

Verification Function
---------------------

.. GENERATED FROM PYTHON SOURCE LINES 52-65

.. code-block:: Python

    def check(m: int, n: int) -> None:
        """
        Verify the add kernel implementation against PyTorch's native add function.

        Args:
            m: First dimension of the test tensors
            n: Second dimension of the test tensors
        """
        x = torch.randn([m, n], device="cuda", dtype=torch.float16)
        y = torch.randn([m, n], device="cuda", dtype=torch.float16)
        run_example(add, torch.add, (x, y))
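
The internals of ``run_example`` are not shown on this page. As a rough, CPU-only sketch of the underlying idea (call a candidate and a reference on the same arguments and compare the results), something like the following could be used. ``assert_matches_reference`` is a hypothetical helper, and the lambda merely stands in for the kernel under test.

.. code-block:: Python

    import torch


    def assert_matches_reference(candidate, reference, args) -> None:
        """Run both callables on the same arguments and compare outputs (hypothetical helper)."""
        actual = candidate(*args)
        expected = reference(*args)
        # assert_close checks shape, dtype, and values within dtype-dependent tolerances.
        torch.testing.assert_close(actual, expected)


    x = torch.randn(8, 8, dtype=torch.float16)
    y = torch.randn(8, 8, dtype=torch.float16)

    # Stand-in for the kernel under test; assumed here so the sketch runs without a GPU.
    assert_matches_reference(lambda a, b: a + b, torch.add, (x, y))

    # A deliberately wrong candidate should be rejected.
    try:
        assert_matches_reference(lambda a, b: a - b + 1, torch.add, (x, y))
        mismatch_detected = False
    except AssertionError:
        mismatch_detected = True
    print(mismatch_detected)

Comparing with a tolerance rather than exact equality matters for ``float16`` inputs, where a kernel and a reference may round differently.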


.. GENERATED FROM PYTHON SOURCE LINES 66-68

Main Function
-------------

.. GENERATED FROM PYTHON SOURCE LINES 68-77

.. code-block:: Python

    def main() -> None:
        """
        Main entry point that runs the add kernel verification with 1024x1024 tensors.
        """
        check(1024, 1024)


    if __name__ == "__main__":
        main()


.. _sphx_glr_download_examples_add.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: add.ipynb <add.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: add.py <add.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: add.zip <add.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_