{"cells": [{"cell_type": "markdown", "id": "1b9d02c0", "metadata": {}, "source": ["# Use of PYBIND11_MAKE_OPAQUE\n", "\n", "[pybind11](https://pybind11.readthedocs.io/) automatically converts a `std::vector` into a Python list. That is convenient but not necessarily efficient, depending on how the result is used afterwards. [PYBIND11_MAKE_OPAQUE](https://pybind11.readthedocs.io/en/stable/advanced/cast/stl.html#making-opaque-types) is used to create a [capsule](https://docs.python.org/3/c-api/capsule.html) that holds a pointer to the C++ object."]}, {"cell_type": "code", "execution_count": 1, "id": "f4a5179a", "metadata": {}, "outputs": [{"data": {"text/html": ["
\n", ""], "text/plain": [""]}, "execution_count": 2, "metadata": {}, "output_type": "execute_result"}], "source": ["from jyquickhelper import add_notebook_menu\n", "add_notebook_menu()"]}, {"cell_type": "code", "execution_count": 2, "id": "4f747872", "metadata": {}, "outputs": [], "source": ["%matplotlib inline"]}, {"cell_type": "markdown", "id": "aaf98a54", "metadata": {}, "source": ["## Two identical classes\n", "\n", "Both of them create random vectors equivalent to `std::vector`, and `Tensor ~ std::vector`. The first one returns a capsule due to `PYBIND11_MAKE_OPAQUE(std::vector)` inserted into the C++ code. The other one returns a list."]}, {"cell_type": "code", "execution_count": 3, "id": "68b5a870", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["\n"]}], "source": ["from cpyquickhelper.examples.vector_container_python import (\n", " RandomTensorVectorFloat, RandomTensorVectorFloat2)\n", "\n", "rnd = RandomTensorVectorFloat(10, 10)\n", "result = rnd.get_tensor_vector()\n", "print(result)"]}, {"cell_type": "code", "execution_count": 4, "id": "9e696e48", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["\n"]}], "source": ["result_ref = rnd.get_tensor_vector_ref()\n", "print(result_ref)"]}, {"cell_type": "code", "execution_count": 5, "id": "ffa90177", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["[, , , , , , , , , ]\n"]}], "source": ["rnd2 = RandomTensorVectorFloat2(10, 10)\n", "result2 = rnd2.get_tensor_vector()\n", "print(result2)"]}, {"cell_type": "code", "execution_count": 6, "id": "b1c8b697", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["\n"]}], "source": ["result2_ref = rnd2.get_tensor_vector_ref()\n", "print(result2_ref)"]}, {"cell_type": "code", "execution_count": 7, "id": "b51163b3", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["3.13 \u00b5s \u00b1 60.6 ns per loop (mean \u00b1 
std. dev. of 7 runs, 100,000 loops each)\n"]}], "source": ["%timeit rnd.get_tensor_vector()"]}, {"cell_type": "code", "execution_count": 8, "id": "8de22452", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["1.21 \u00b5s \u00b1 38.4 ns per loop (mean \u00b1 std. dev. of 7 runs, 1,000,000 loops each)\n"]}], "source": ["%timeit rnd.get_tensor_vector_ref()"]}, {"cell_type": "code", "execution_count": 9, "id": "4466169b", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["10.4 \u00b5s \u00b1 138 ns per loop (mean \u00b1 std. dev. of 7 runs, 100,000 loops each)\n"]}], "source": ["%timeit rnd2.get_tensor_vector()"]}, {"cell_type": "code", "execution_count": 10, "id": "c716c766", "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["1.19 \u00b5s \u00b1 18.1 ns per loop (mean \u00b1 std. dev. of 7 runs, 1,000,000 loops each)\n"]}], "source": ["%timeit rnd2.get_tensor_vector_ref()"]}, {"cell_type": "markdown", "id": "e3cafd4d", "metadata": {}, "source": ["## Scenarios\n", "\n", "Three possibilities:\n", "\n", "* **list**: `std::vector` is converted into a list of copied Tensors.\n", "* **capsule**: `std::vector` is converted into a capsule wrapping a copied `std::vector`; the capsule holds the pointer and is responsible for deleting it.\n", "* **ref**: `std::vector` is just returned as a pointer. The cost of getting the pointer does not depend on the content size. 
It is somehow the low limit."]}, {"cell_type": "markdown", "id": "f2ed080e", "metadata": {}, "source": ["## Plots"]}, {"cell_type": "code", "execution_count": 11, "id": "7cd2b82d", "metadata": {}, "outputs": [{"name": "stderr", "output_type": "stream", "text": ["100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 144/144 [00:19<00:00, 7.43it/s]\n"]}, {"data": {"text/html": ["\n", "\n", "
\n", "
"], "text/plain": [" average deviation min_exec max_exec repeat number ttime \\\n", "409 0.018321 0.000367 0.018026 0.018838 3 3 0.054962 \n", "410 0.000002 0.000001 0.000001 0.000004 3 3 0.000007 \n", "411 0.010974 0.000492 0.010512 0.011656 3 3 0.032923 \n", "412 0.035484 0.000900 0.034286 0.036456 3 3 0.106451 \n", "413 0.000003 0.000002 0.000001 0.000005 3 3 0.000008 \n", "\n", " context_size name n_vectors size \n", "409 64 list 10000 200 \n", "410 64 ref 10000 200 \n", "411 64 capsule 10000 500 \n", "412 64 list 10000 500 \n", "413 64 ref 10000 500 "]}, "execution_count": 12, "metadata": {}, "output_type": "execute_result"}], "source": ["import itertools\n", "from cpyquickhelper.numbers.speed_measure import measure_time\n", "from tqdm import tqdm\n", "import pandas\n", "\n", "data = []\n", "sizes = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000, 5000, 10000]\n", "sizes = list(itertools.product(sizes, sizes))\n", "for i, j in tqdm(sizes):\n", " if j >= 1000:\n", " if i > 1000:\n", " continue\n", " if i * j >= 1e6:\n", " repeat, number = 3, 3\n", " else:\n", " repeat, number = 10, 10\n", " rnd = RandomTensorVectorFloat(i, j)\n", " obs = measure_time(lambda: rnd.get_tensor_vector(), repeat=repeat, number=number, div_by_number=True)\n", " obs['name'] = 'capsule'\n", " obs['n_vectors'] = i\n", " obs['size'] = j\n", " data.append(obs)\n", "\n", " rnd2 = RandomTensorVectorFloat2(i, j)\n", " obs = measure_time(lambda: rnd2.get_tensor_vector(), repeat=repeat, number=number, div_by_number=True)\n", " obs['name'] = 'list'\n", " obs['n_vectors'] = i\n", " obs['size'] = j\n", " data.append(obs)\n", "\n", " obs = measure_time(lambda: rnd2.get_tensor_vector_ref(), repeat=repeat, number=number, div_by_number=True)\n", " obs['name'] = 'ref'\n", " obs['n_vectors'] = i\n", " obs['size'] = j\n", " data.append(obs)\n", " \n", "df = pandas.DataFrame(data)\n", "df.tail()"]}, {"cell_type": "code", "execution_count": 12, "id": "e66b1106", "metadata": {}, "outputs": [{"data": {"text/html": 
["\n", "\n", "
\n", "
"], "text/plain": ["name capsule list ref ratio\n", "n_vectors size \n", "10000 20 0.001611 0.009616 0.000001 0.167512\n", " 50 0.001892 0.010399 0.000001 0.181907\n", " 100 0.002597 0.014054 0.000002 0.184789\n", " 200 0.004847 0.018321 0.000002 0.264565\n", " 500 0.010974 0.035484 0.000003 0.309278"]}, "execution_count": 13, "metadata": {}, "output_type": "execute_result"}], "source": ["piv = pandas.pivot_table(df, index=['n_vectors', 'size'], columns=['name'], values='average')\n", "piv['ratio'] = piv['capsule'] / piv['list']\n", "piv.tail()"]}, {"cell_type": "code", "execution_count": 13, "id": "23ccccc5", "metadata": {}, "outputs": [{"data": {"image/png": "\n", "text/plain": [""]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["import matplotlib.pyplot as plt\n", "fig, ax = plt.subplots(1, 2, figsize=(10, 4))\n", "piv[['capsule', 'list', 'ref']].plot(logy=True, ax=ax[0], title='Capsule (OPAQUE) / list')\n", "piv.sort_values('ratio', ascending=False)[['ratio']].plot(ax=ax[1], title='Ratio Capsule (OPAQUE) / list');"]}, {"cell_type": "code", "execution_count": 14, "id": "2fde92f9", "metadata": {}, "outputs": [{"data": {"text/html": ["\n", "\n", "
\n", "
"], "text/plain": ["size 1 2 5 10 20 50 \\\n", "n_vectors \n", "1 0.566917 1.130723 0.995022 1.023320 1.028793 1.612390 \n", "2 0.862783 0.757669 0.775862 0.775123 0.829735 0.763393 \n", "5 0.379608 0.555293 0.368967 0.470159 0.378897 0.410572 \n", "10 0.301479 0.316046 0.374528 0.303198 0.407169 0.359681 \n", "20 0.356484 0.251247 0.230067 0.210293 0.252366 0.255456 \n", "50 0.231620 0.220888 0.210009 0.216061 0.214163 0.282138 \n", "100 0.394659 0.189204 0.256503 0.206615 0.214958 0.218592 \n", "200 0.201795 0.189723 0.211110 0.197427 0.212577 0.212658 \n", "500 0.189678 0.169699 0.188959 0.193446 0.184101 0.209909 \n", "1000 0.181940 0.183248 0.183335 0.186856 0.191277 0.195801 \n", "5000 0.183440 0.179145 0.178354 0.179633 0.178630 0.185844 \n", "10000 0.178259 0.174690 0.172229 0.166271 0.167512 0.181907 \n", "\n", "size 100 200 500 1000 5000 10000 \n", "n_vectors \n", "1 0.943035 1.717130 1.284653 0.973898 1.083473 0.663566 \n", "2 0.776532 0.762319 0.871343 0.699622 0.612026 0.539308 \n", "5 0.589214 0.434686 0.395905 0.419958 0.461235 0.473515 \n", "10 0.288451 0.360526 0.402298 0.398315 0.393545 0.439519 \n", "20 0.288205 0.284486 0.252392 0.257864 0.416393 0.043764 \n", "50 0.266862 0.206823 0.263674 0.339590 0.039424 0.493669 \n", "100 0.240267 0.225569 0.243605 0.314168 0.412960 0.406809 \n", "200 0.212798 0.225500 0.257274 0.277588 0.439891 0.427631 \n", "500 0.205913 0.229271 0.243023 0.110660 0.472577 0.456907 \n", "1000 0.197123 0.209983 0.262004 0.184530 0.471995 0.467629 \n", "5000 0.194558 0.160261 0.334000 NaN NaN NaN \n", "10000 0.184789 0.264565 0.309278 NaN NaN NaN "]}, "execution_count": 15, "metadata": {}, "output_type": "execute_result"}], "source": ["flat = piv.reset_index(drop=False)[['n_vectors', 'size', 'ratio']]\n", "# keyword arguments: recent pandas versions no longer accept positional ones here\n", "flat_piv = flat.pivot(index='n_vectors', columns='size', values='ratio')\n", "flat_piv"]}, {"cell_type": "code", "execution_count": 15, "id": "b69db209", "metadata": {}, "outputs": [{"data": {"image/png": "\n", "text/plain": [""]}, 
"metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["import numpy\n", "import seaborn\n", "# rows of flat_piv are n_vectors (y axis), columns are size (x axis)\n", "seaborn.heatmap(numpy.minimum(flat_piv.values, 1), cmap=\"YlGnBu\",\n", " xticklabels=list(flat_piv.columns), yticklabels=list(flat_piv.index));"]}, {"cell_type": "code", "execution_count": 16, "id": "7449de78", "metadata": {}, "outputs": [], "source": []}, {"cell_type": "code", "execution_count": 17, "id": "45756699", "metadata": {}, "outputs": [], "source": []}], "metadata": {"kernelspec": {"display_name": "Python 3", "language": "python", "name": "python3"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.5"}}, "nbformat": 4, "nbformat_minor": 5}