Giter Club home page Giter Club logo

Comments (10)

yehoshuadimarsky avatar yehoshuadimarsky commented on July 30, 2024

Can you post your complete code example that fails?

from python-ssas.

nisharncslabs avatar nisharncslabs commented on July 30, 2024

Sure, so I have 2 files in my editor (ssas_api.py copy exactly same as in repo)

  1. Quick start script with code to be run to get dataframe back from DAX query ping to AAS server.
  2. ssas_api.py copy file containing functions which can connect to AAS.

These are both shown down below.

My main question is: I am not too sure what additional things I need to do along with having these files in order to run/get the DAX query data successfully from a particular AAS server.

  • I have confirmed the location of both the prerequisite dll files as being given in the python snippet in the Quickstart file below, not sure if I have to change this in the ssas_api.py file where it says root.
  • I have noticed that in the ssas_api.py file the import System command and subsequent ones below are greyed out as if they arent found. Do I need to pip install something inorder for these to work?

File 1: QuickStart

import ssas_api
import clr

'''
# Python Snippet showing where the required dll files are located

base = "C:/Program Files/PackageManagement/NuGet/Packages/Microsoft.AnalysisServices"
_version = "19.20.1.0"  # at time of this writing
AMO_PATH = f"{base}.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.Tabular.dll"
ADOMD_PATH = f"{base}.AdomdClient.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.AdomdClient.dll"
'''

#_load_assemblies(amo_path=amo_path, adomd_path=adomd_path)


conn = ssas_api.set_conn_string(
    server='<aas_servername>',
    db_name='<db_name>',
    username='<email>',
    password='<password>'
)



dax_string = '''  
MY DAX QUERY
'''

df = ssas_api.get_DAX(connection_string=conn, dax_string=dax_string)

File 2: ssas_api.py script for AAS connection

# -*- coding: utf-8 -*-
"""
Created on Wed Sep 20 16:59:43 2017

@author: Yehoshua
"""

import pandas as pd
import numpy as np
from functools import wraps
from pathlib import Path
import logging
import warnings

logger = logging.getLogger(__name__)

try:
    import clr  # name for pythonnet
except ImportError:
    msg = """
    Could not import 'clr', install the 'pythonnet' library. 
    For conda, `conda install -c pythonnet pythonnet`
    """
    raise ImportError(msg)


def _load_assemblies(amo_path=None, adomd_path=None):
    """
    Loads required assemblies, called after function definition.
    Might need to install SSAS client libraries:
    https://docs.microsoft.com/en-us/azure/analysis-services/analysis-services-data-providers

    Parameters
    ----------
    amo_path : str, default None
        The full path to the DLL file of the assembly for AMO.
        Should end with '**Microsoft.AnalysisServices.Tabular.dll**'
        Example: C:/my/path/to/Microsoft.AnalysisServices.Tabular.dll
        If None, will use the default location on Windows.
    adomd_path : str, default None
        The full path to the DLL file of the assembly for ADOMD.
        Should end with '**Microsoft.AnalysisServices.AdomdClient.dll**'
        Example: C:/my/path/to/Microsoft.AnalysisServices.AdomdClient.dll
        If None, will use the default location on Windows.
    """
    # Full path of .dll files
    root = Path(r"C:\Windows\Microsoft.NET\assembly\GAC_MSIL")
    # get latest version of libraries if multiple libraries are installed (max func)
    if amo_path is None:
        amo_path = str(
            max((root / "Microsoft.AnalysisServices.Tabular").iterdir())
            / "Microsoft.AnalysisServices.Tabular.dll"
        )
    if adomd_path is None:
        adomd_path = str(
            max((root / "Microsoft.AnalysisServices.AdomdClient").iterdir())
            / "Microsoft.AnalysisServices.AdomdClient.dll"
        )

    # load .Net assemblies
    logger.info("Loading .Net assemblies...")
    clr.AddReference("System")
    clr.AddReference("System.Data")
    clr.AddReference(amo_path)
    clr.AddReference(adomd_path)

    # Only after loaded .Net assemblies
    global System, DataTable, AMO, ADOMD

    **import System
    from System.Data import DataTable
    import Microsoft.AnalysisServices.Tabular as AMO
    import Microsoft.AnalysisServices.AdomdClient as ADOMD**

    logger.info("Successfully loaded these .Net assemblies: ")
    for a in clr.ListAssemblies(True):
        logger.info(a.split(",")[ 0 ])


def _assert_dotnet_loaded(func):
    """
    Wrapper to make sure that required .NET assemblies have been loaded and imported.
    Can pass the keyword arguments 'amo_path' and 'adomd_path' to any annotated function,
    it will use them in the `_load_assemblies` function.

    Example:
        .. code-block:: python

            import ssas_api
            conn = ssas_api.set_conn_string(
                's', 'd', 'u', 'p',
                amo_path='C:/path/number/one',
                adomd_path='C:/path/number/two'
            )
    """

    @wraps(func)
    def wrapper(*args, **kwargs):
        amo_path = kwargs.pop("amo_path", None)
        adomd_path = kwargs.pop("adomd_path", None)
        try:
            type(DataTable)
        except NameError:
            # .NET assemblies not loaded/imported
            logger.warning(".Net assemblies not loaded and imported, doing so now...")
            _load_assemblies(amo_path=amo_path, adomd_path=adomd_path)
        return func(*args, **kwargs)

    return wrapper


@_assert_dotnet_loaded
def set_conn_string(server, db_name, username, password):
    """
    Sets connection string to SSAS database,
    in this case designed for Azure Analysis Services
    """
    conn_string = (
        "Provider=MSOLAP;Data Source={};Initial Catalog={};User ID={};"
        "Password={};Persist Security Info=True;Impersonation Level=Impersonate".format(
            server, db_name, username, password
        )
    )
    return conn_string


@_assert_dotnet_loaded
def get_DAX(connection_string, dax_string):
    """
    Executes DAX query and returns the results as a pandas DataFrame

    Parameters
    ---------------
    connection_string : string
        Valid SSAS connection string, use the set_conn_string() method to set
    dax_string : string
        Valid DAX query, beginning with EVALUATE or VAR or DEFINE

    Returns
    ----------------
    pandas DataFrame with the results
    """
    table = _get_DAX(connection_string, dax_string)
    df = _parse_DAX_result(table)
    return df


def _get_DAX(connection_string, dax_string) -> "DataTable":
    dataadapter = ADOMD.AdomdDataAdapter(dax_string, connection_string)
    table = DataTable()
    logger.info("Getting DAX query...")
    dataadapter.Fill(table)
    logger.info("DAX query successfully retrieved")
    return table


def _parse_DAX_result(table: "DataTable") -> pd.DataFrame:
    cols = [ c for c in table.Columns.List ]
    rows = [ ]
    # much better performance to just access data by position instead of name
    # and then add column names afterwards
    for r in range(table.Rows.Count):
        row = [ table.Rows[ r ][ c ] for c in cols ]
        rows.append(row)

    df = pd.DataFrame.from_records(rows, columns=[ c.ColumnName for c in cols ])

    # replace System.DBNull with None
    # df.replace({System.DBNull: np.NaN}) doesn't work for some reason
    df = df.applymap(lambda x: np.NaN if isinstance(x, System.DBNull) else x)

    # convert datetimes
    dt_types = [ c.ColumnName for c in cols if c.DataType.FullName == "System.DateTime" ]
    if dt_types:
        for dtt in dt_types:
            # if all nulls, then pd.to_datetime will fail
            if not df.loc[ :, dtt ].isna().all():
                # https://docs.microsoft.com/en-us/dotnet/standard/base-types/standard-date-and-time-format-strings#Sortable
                ser = df.loc[ :, dtt ].map(lambda x: x.ToString('s'))
                df.loc[ :, dtt ] = pd.to_datetime(ser)

    # convert other types
    types_map = {"System.Int64": int, "System.Double": float, "System.String": str}
    col_types = {c.ColumnName: types_map.get(c.DataType.FullName, "object") for c in cols}

    # handle NaNs (which are floats, as of pandas v.0.25.3) in int columns
    col_types_ints = {k for k, v in col_types.items() if v == int}
    ser = df.isna().any(axis=0)
    col_types.update({k: float for k in set(ser[ ser ].index).intersection(col_types_ints)})

    # convert
    df = df.astype(col_types)

    return df


@_assert_dotnet_loaded
def process_database(connection_string, refresh_type, db_name):
    process_model(
        connection_string=connection_string,
        item_type="model",
        refresh_type=refresh_type,
        db_name=db_name,
    )


@_assert_dotnet_loaded
def process_table(connection_string, table_name, refresh_type, db_name):
    process_model(
        connection_string=connection_string,
        item_type="table",
        item=table_name,
        refresh_type=refresh_type,
        db_name=db_name,
    )


@_assert_dotnet_loaded
def process_model(connection_string, db_name, refresh_type="full", item_type="model", item=None):
    """
    Processes SSAS data model to get new data from underlying source.

    Parameters
    -------------
    connection_string : string
        Valid SSAS connection string, use the set_conn_string() method to set
    db_name : string
        The data model on the SSAS server to process
    refresh_type : string, default `full`
        Type of refresh to process. Currently only supports `full`.
    item_type : string, choice of {'model','table'}, default 'model'
    item : string, optional.
        Then name of the item. Only needed when item_type is 'table', to specify the table name
    """
    assert item_type.lower() in ("table", "model"), f"Invalid item type: {item_type}"
    if item_type.lower() == "table" and not item:
        raise ValueError("If item_type is table, must supply an item (a table name) to process")

    # connect to the AS instance from Python
    AMOServer = AMO.Server()
    logger.info("Connecting to database...")
    AMOServer.Connect(connection_string)

    # Dict of refresh types
    refresh_dict = {"full": AMO.RefreshType.Full}

    # process
    db = AMOServer.Databases[ db_name ]

    if item_type.lower() == "table":
        table = db.Model.Tables.Find(item)
        table.RequestRefresh(refresh_dict[ refresh_type ])
    else:
        db.Model.RequestRefresh(refresh_dict[ refresh_type ])

    op_result = db.Model.SaveChanges()
    if op_result.Impact.IsEmpty:
        logger.info("No objects affected by the refresh")

    logger.info("Disconnecting from Database...")
    # Disconnect
    AMOServer.Disconnect()

from python-ssas.

yehoshuadimarsky avatar yehoshuadimarsky commented on July 30, 2024

No you do not need to install System or any of the .Net imports, those are .Net libraries that we are loading through Python using Pythonnet. This should work as written. Is it working?

from python-ssas.

nisharncslabs avatar nisharncslabs commented on July 30, 2024

No it doesn't seem to be working. The errors I find when I try running my Quickstart file (as stated above) are:

  1. NameError: name 'DataTable' is not defined: Which I think is caused by the for line of code in the ssas_api.py script not being detected.
  2. During handling of the above exception, another exception occurred: FileNotFoundError: [WinError 3] The system cannot find the path specified: 'C:\\Windows\\Microsoft.NET\\assembly\\GAC_MSIL\\Microsoft.AnalysisServices.Tabular' : Which I think is the result of the .dll files not being in the location required/ assumed by your ssas_api.py script? As a result might need to change the .dll file location in the script using the python snippet provided? I might be wrong about this?

I just want to ensure that if we take the Quickstart & ssas_api.py (as is) into the same directory and run the Quickstart it should work as is? I am not entirely sure if I need to change anything or am doing something wrong. I just want to be able send a DAX query against AAS and receive a response and store the result in a dataframe. Not sure if you are able to provide further guidance on how to do this?

from python-ssas.

nisharncslabs avatar nisharncslabs commented on July 30, 2024

I did also follow the steps in the README.md "Getting The Required .Net Libraries" regarding the .net libraries and confirmed that the following python snippet applies in my case:

base = "C:/Program Files/PackageManagement/NuGet/Packages/Microsoft.AnalysisServices"
_version = "19.20.1.0"  # at time of this writing
AMO_PATH = f"{base}.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.Tabular.dll"
ADOMD_PATH = f"{base}.AdomdClient.retail.amd64.{_version}/lib/net45/Microsoft.AnalysisServices.AdomdClient.dll"

Do I need to copy this info into the ssas_api.py file in anyway for it to run?

from python-ssas.

yehoshuadimarsky avatar yehoshuadimarsky commented on July 30, 2024

yes that is the issue. your .Net DLLs are not being found. All of the functions optionally take params to specify where they are (passing to the underlying _load_assemblies decorator function), if no params are passed then it looks in the default locations, see the code in the snippet below.

python-ssas/ssas_api.py

Lines 47 to 59 in fc18937

# Full path of .dll files
root = Path(r"C:\Windows\Microsoft.NET\assembly\GAC_MSIL")
# get latest version of libraries if multiple libraries are installed (max func)
if amo_path is None:
amo_path = str(
max((root / "Microsoft.AnalysisServices.Tabular").iterdir())
/ "Microsoft.AnalysisServices.Tabular.dll"
)
if adomd_path is None:
adomd_path = str(
max((root / "Microsoft.AnalysisServices.AdomdClient").iterdir())
/ "Microsoft.AnalysisServices.AdomdClient.dll"
)

So if it's not finding it in the default location, you should pass the location as params, such as this

df = ssas_api.get_DAX(
    connection_string=conn, 
    dax_string=dax_string, 
    amo_path='C:/path/number/one',
    adomd_path='C:/path/number/two'
)

Example in the code/docstring:

python-ssas/ssas_api.py

Lines 83 to 96 in fc18937

Wrapper to make sure that required .NET assemblies have been loaded and imported.
Can pass the keyword arguments 'amo_path' and 'adomd_path' to any annotated function,
it will use them in the `_load_assemblies` function.
Example:
.. code-block:: python
import ssas_api
conn = ssas_api.set_conn_string(
's', 'd', 'u', 'p',
amo_path='C:/path/number/one',
adomd_path='C:/path/number/two'
)
"""

from python-ssas.

nisharncslabs avatar nisharncslabs commented on July 30, 2024

Ok, that makes sense however even when I used the file paths as specififed ie AMO_PATH & ADOMD_PATH in the fashion that you are suggesting (inside the get_DAX function) it stills throws the error. I have attached a screenshot of the error and the code chunk I think its referring too.

2021-05-05 (3)_LI
2021-05-05 (5)_LI

from python-ssas.

yehoshuadimarsky avatar yehoshuadimarsky commented on July 30, 2024

That doesn't make sense.

  • I don't see the exact error in the screen shot, what's the error?
  • what's your code?

from python-ssas.

nisharncslabs avatar nisharncslabs commented on July 30, 2024

The error that I am still getting is the same as what I described above.

  • After I tried the suggestion of forcefully specifying the file path of the .dll extension files it still comes up with the same error.
  • As you can see in the 1st screen shot the error (the stuff in red at the bottom) mentions NameError: DataTable not defined and if you look above you can see the greyed out code that I was mentioning. For some reason this code isn't being detected by Pycharm IDE not sure what I am missing or why this is but as a result it seems the ssas_api.py script doesn't work/execute as intended and hence cannot define the DataTable as the error says.
  • The 2nd screenshot shows the last part of the error (also described in an earlier post) about how the file path can't be found even though above I have forceibly specified the file path of the .dll files as you confirmed was what should be done.

from python-ssas.

dominik-hayd avatar dominik-hayd commented on July 30, 2024

Hey @nisharncslabs,
I know it's been a while since you created this issue and I'm not the owner of this project. But since i encountered the same issues I want to share how I solved them and my understanding on whats the problem in your case.

First of all: The lines 70-73 should be highlighted by your IDE. It reads these imports and looks for matching python packages, which are obviously not available. The imports are loaded through pythonnet from the .DLL files. But this is just a warning and can be safely ignored, because we know why they occur.

Now to your error message. I think python is a bit misleading in this case. It reports two exceptions:

  1. The NameError: DataTable not defined. The exception is thrown in line 102. @yehoshuadimarsky accesses the type of DataTable there to check whether the .Net dlls are already loaded. So at the first execution this check has to throw an exception to trigger the dll loading, which is done in the except block (line 107).
  2. The FileNotFound error. This comes from pythonnet trying to load one of the dll files. This is the real problem.

Because the second exception is thrown in the except block python shows you both exception.

Now to solve the problem:
(I can only tell you how I solved it 🙈)

I copied both .dll files (Tabular and AdomdClient) to the root directory of my project.
In my first call to python_ssas I passed absolute paths to both files.
conn = powerbi.set_conn_string('workspace', 'datset', 'user', 'password', amo_path='C:\\Users\\...\\Microsoft.AnalysisServices.Tabular.dll', adomd_path='C:\\Users\\...\\Microsoft.AnalysisServices.AdomdClient.dll')

For me this solves the error.
If this isn't working for you play around with importing the dlls for a bit:
Create a new python file to test just the import without an extra stuff.

import clr
clr.AddReference("System")
clr.AddReference("System.Data")
clr.AddReference(<path to your tabular dll>)
clr.AddReference(<path to your adomdclient dll>)

import System
from System.Data import DataTable
import Microsoft.AnalysisServices.Tabular as AMO
import Microsoft.AnalysisServices.AdomdClient as ADOMD

Adjust the paths until you get this script running and then supply the same paths to your first call to one of the python_ssas functions.

Hope this helps 😃

from python-ssas.

Related Issues (13)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.