    Quantiacs Community


    Why doesn't .interpolate_na() work well?

      cyan.gloom last edited by

      Hello,

      I'd like to fill the NaN values in my data with .interpolate_na(), the same way df.interpolate() works in pandas.

      Best regards,

      import xarray as xr
      
      import qnt.ta as qnta
      import qnt.data as qndata
      import qnt.output as qnout
      import qnt.stats as qns
      
      data = qndata.stocks.load_ndx_data(min_date="2005-06-01")
      close = data.sel(field='close')
      it = close.sel(asset=['NAS:AAPL','NAS:ADBE','NAS:ADI','NAS:ADSK','NAS:AKAM',
                                 'NAS:AMAT','NAS:AMD','NAS:AMZN','NAS:ANSS','NAS:ASML',
                                 'NAS:ATVI','NAS:BIDU','NAS:CDNS','NAS:CDW','NAS:CERN',
                                 'NAS:CHKP','NAS:CMCSA','NAS:CSCO','NAS:CTSH','NAS:CTXS',
                                 'NAS:DISCA','NAS:DISCK','NAS:DISH','NAS:EA','NAS:EBAY',
                                 'NAS:ERIC','NAS:EXPE','NAS:FB','NAS:FFIV','NAS:FISV',
                                 'NAS:FLEX','NAS:FLIR','NAS:FTNT','NAS:INTC','NAS:INTU',
                                 'NAS:JD','NAS:KLAC','NAS:LBTYA','NAS:LBTYK','NAS:LOGI',
                                 'NAS:LRCX','NAS:MCHP','NAS:MELI','NAS:META','NAS:MSFT',
                                 'NAS:MU','NAS:MXIM','NAS:NFLX','NAS:NTAP','NAS:NTES',
                                 'NAS:NUAN','NAS:NVDA','NAS:NXPI','NAS:PANW','NAS:PAYX',
                                 'NAS:QCOM','NAS:SIRI','NAS:SNPS','NAS:SPLK','NAS:SWKS',
                                 'NAS:TIGO','NAS:TMUS','NAS:TRIP','NAS:TTWO','NAS:TXN',
                                 'NAS:VOD','NAS:VRSN','NAS:WDAY','NAS:WDC','NAS:XLNX',
                                 'NYS:BB','NYS:INFY','NYS:JNPR','NYS:ORCL',])
      it = it.interpolate_na(dim='time', method='linear')
      it
      
        support last edited by

        OK, could you give us more details? What exactly is the problem? Best regards

          cyan.gloom @support last edited by

          @support

          After I apply .dropna() to 'it', some of the data starting from '2005-06-01' is deleted, even though I interpolated the NaNs first.

          What do you think causes this?

            antinomy last edited by

            @cyan-gloom
            interpolate_na() only fills NaNs that lie between two valid data points. Take a look at this example:

            import qnt.data as qndata
            import numpy as np
            
            stocks = qndata.stocks_load_ndx_data()
            sample = stocks[:, -5:, -6:] # The latest 5 dates for the last 6 assets
            
            print(sample.sel(field='close').to_pandas())
            """
            asset       NYS:NCLH  NYS:ORCL  NYS:PRGO  NYS:QGEN  NYS:RHT  NYS:TEVA
            time                                                                 
            2023-05-12     13.24     97.85     35.21     45.09      NaN      8.03
            2023-05-15     13.71     97.26     34.23     45.36      NaN      8.07
            2023-05-16     13.48     98.25     32.84     45.25      NaN      8.13
            2023-05-17     14.35     99.77     32.86     44.95      NaN      8.13
            2023-05-18     14.53    102.34     33.43     44.92      NaN      8.26
            """
            
            # Let's add some more NaN values:
            sample.values[3, (1,3), 0] = np.nan
            sample.values[3, 1:4, 1] = np.nan
            sample.values[3, :2, 2] = np.nan
            sample.values[3, 2:, 3] = np.nan
            sample.values[3, :-1, 5] = np.nan
            print(sample.sel(field='close').to_pandas())
            """
            asset       NYS:NCLH  NYS:ORCL  NYS:PRGO  NYS:QGEN  NYS:RHT  NYS:TEVA
            time                                                                 
            2023-05-12     13.24     97.85       NaN     45.09      NaN       NaN
            2023-05-15       NaN       NaN       NaN     45.36      NaN       NaN
            2023-05-16     13.48       NaN     32.84       NaN      NaN       NaN
            2023-05-17       NaN       NaN     32.86       NaN      NaN       NaN
            2023-05-18     14.53    102.34     33.43       NaN      NaN      8.26
            """
            
            # Interpolate the NaN values:
            print(sample.interpolate_na('time').sel(field='close').to_pandas())
            """
            asset       NYS:NCLH    NYS:ORCL  NYS:PRGO  NYS:QGEN  NYS:RHT  NYS:TEVA
            time                                                                   
            2023-05-12    13.240   97.850000       NaN     45.09      NaN       NaN
            2023-05-15    13.420  100.095000       NaN     45.36      NaN       NaN
            2023-05-16    13.480  100.843333     32.84       NaN      NaN       NaN
            2023-05-17    14.005  101.591667     32.86       NaN      NaN       NaN
            2023-05-18    14.530  102.340000     33.43       NaN      NaN      8.26
            """
            

            As you can see, only the NaNs in the first 2 columns are being replaced. The others remain untouched and might be dropped when you use dropna().
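
The same behavior can be reproduced without the Quantiacs data. This is a minimal sketch on synthetic values: the asset names A, B, C and all prices are made up for illustration, and it assumes a reasonably recent xarray in which the plain linear method does not extrapolate beyond the first or last valid value.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic prices: hypothetical assets, not real market data.
time = pd.date_range("2023-05-15", periods=5, freq="D")
close = xr.DataArray(
    np.array([
        [10.0, np.nan, 12.0, np.nan, 14.0],  # A: gaps between valid points
        [np.nan, np.nan, 3.0, 4.0, 5.0],     # B: leading NaNs
        [7.0, 8.0, 9.0, 9.5, np.nan],        # C: trailing NaN
    ]).T,
    dims=("time", "asset"),
    coords={"time": time, "asset": ["A", "B", "C"]},
)

# Linear interpolation along time fills only the gaps inside asset A.
filled = close.interpolate_na(dim="time", method="linear")
print(filled.to_pandas())

# Leading and trailing NaNs are still there, so dropna() removes those dates.
kept = filled.dropna(dim="time")
print(kept.time.values)
```

Only asset A ends up completely filled. The leading NaNs of B and the trailing NaN of C survive the interpolation, so dropna() along time still discards the first two dates and the last one.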

            Another thing to keep in mind is that interpolation can introduce lookahead bias, e.g. in a single-run backtest. In my example (pretend the NaNs I added were already in the data), you would know on 2023-05-15 that ORCL is going to rise, when in reality you would not know that until 2023-05-18.
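
To make the lookahead point concrete, here is a small sketch using pandas rather than the Quantiacs loader. It reuses the ORCL closes from the example (97.85 and 102.34) and compares them with a hypothetical alternative in which the 2023-05-18 close is 90.00. That alternative value is invented purely for illustration. A forward fill is shown only as one example of a gap filler that uses past data alone, not as a recommendation.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(
    ["2023-05-12", "2023-05-15", "2023-05-16", "2023-05-17", "2023-05-18"]
)

def build(last_close):
    """Series with a three-day gap that ends at the given final close."""
    return pd.Series([97.85, np.nan, np.nan, np.nan, last_close], index=dates)

actual = build(102.34)       # closes taken from the example above
alternative = build(90.00)   # hypothetical future, invented for illustration

# Time-weighted linear interpolation uses the value on 2023-05-18.
interp_actual = actual.interpolate(method="time")
interp_alternative = alternative.interpolate(method="time")

# Forward fill only uses values that were already known on each date.
ffill_actual = actual.ffill()
ffill_alternative = alternative.ffill()

day = pd.Timestamp("2023-05-15")
print(interp_actual[day], interp_alternative[day])
print(ffill_actual[day], ffill_alternative[day])
```

The interpolated value for 2023-05-15 changes when the 2023-05-18 close changes, which means it carries information from the future. The forward-filled value is identical in both scenarios.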

              cyan.gloom @antinomy last edited by

              @antinomy

              I got it!
              Thanks a lot!!

                Copyright © 2014 - 2021 Quantiacs LLC.