Ticket #4206 (closed enhancement: fixed)

Opened 3 months ago

Last modified 1 month ago

[with patch, positive review] convert RDF and CDF vectors to use numpy

Reported by: jason Assigned to: jason
Priority: major Milestone: sage-3.2.2
Component: linear algebra Keywords:
Cc:

Attachments

vector-RDF-CDF-numpy.patch (81.1 kB) - added by jason on 11/24/2008 08:16:28 AM.
vector-rdf-doctest-correction.patch (0.7 kB) - added by jason on 11/24/2008 12:28:06 PM.
vector-RDF-CDF-numpy.2.patch (81.1 kB) - added by jason on 11/26/2008 09:40:45 AM.
Rebased for sage-3.2.1.alpha1
vector-RDF-CDF-numpy-sage-3.2.1.alpha2.patch (81.7 kB) - added by jason on 11/27/2008 06:57:37 PM.
v.sobj (356 bytes) - added by was on 11/28/2008 02:07:01 PM.
Reviewer -- make sure that the attached file v.sobj can be loaded after this patch is applied and all traces of old real_double_vector* stuff is deleted from SAGE_ROOT/devel/sage/build/…
real_roots-import.patch (0.7 kB) - added by jason on 12/06/2008 07:38:20 PM.
vector_fft.patch (3.4 kB) - added by jason on 12/08/2008 10:48:12 PM.
apply on top of previous patches
vector-RDF-CDF-numpy-sage-3.2.1.patch (86.6 kB) - added by jason on 12/09/2008 12:05:55 AM.
timeseries-64bit.patch (0.8 kB) - added by jason on 12/09/2008 12:45:03 AM.
apply on top of previous patches

Change History

09/27/2008 02:44:21 AM changed by mabshoff

This is needed after #4205:

[02:20am] mabshoff: mk
[02:21am] jason-:         cdef ndarray _n = numpy.array(list(self),dtype=numpy.dtype('float64'))
[02:21am] jason-: #        temp = PyArray_SimpleNew(1, dims, NPY_DOUBLE)
[02:21am] jason-: #        _n = temp
[02:21am] jason-: #        for i from 0<=i<_V.v.size:
[02:21am] jason-: #            _n[i] = float(_V.v.data[i])
[02:21am] jason-: comment from that line down to the end of the function
[02:21am] jason-: and replace it with the above numpy.array line.
[02:21am] jason-: oh, and add import numpy above that.
<SNIP>
[02:24am] mabshoff: sage: sage: v = vector(RDF,4,range(4))
[02:24am] mabshoff: sage: v.numpy()
[02:24am] mabshoff: array([ 0.,  1.,  2.,  3.])
[02:24am] mabshoff: sage: v
[02:24am] mabshoff: (0.0, 1.0, 2.0, 3.0)
[02:24am] mabshoff: so that works now
[02:24am] jason-: can I just replace it?
[02:24am] jason-: it's going to be really slow, but it works
[02:25am] mabshoff: well, I guess for now we can live with that.
[02:25am] jason-: well, I don't know how much slower
[02:25am] jason-: can you do a timeit test?
[02:25am] mabshoff: sure, in a minute

Cheers,

Michael

09/27/2008 02:46:53 AM changed by mabshoff

And this is needed in

Look for the two comments like:

        # We should be using the C API here.  When upgrading to 1.2.0,
        # we didn't get the C API to work for some reason on sage.math
        # (64-bit) even though it worked fine on a 32-bit box.
        cdef ndarray _n = numpy.array(list(self),dtype=numpy.dtype('float64'))

10/25/2008 01:30:27 AM changed by jason

I've attached some rough initial work based on the patches to RDF and CDF matrices. Probably none of the docstrings are correct. This patch will generate C code (in Sage 3.1.4), but won't compile.

11/11/2008 01:39:48 AM changed by jason

Updated code; now sage builds and passes all doctests in the old *_double_vector.pyx files.

TODO:

  • Make the real_roots and timeseries files use the Cython buffer interface
  • Update doctests to be for the vector code
  • One last runthrough to make sure the code looks reasonable
  • doctest everything.

11/11/2008 01:40:05 AM changed by jason

  • summary changed from convert RDF and CDF vectors to use numpy to [with patch, needs work] convert RDF and CDF vectors to use numpy.

(follow-up: ↓ 7 ) 11/11/2008 12:06:31 PM changed by was

Make the real_roots and timeseries files use the Cython buffer interface

If you do that to timeseries, please do not degrade performance at all. The whole point of timeseries is "blazing speed" -- many of the functions are easily 10-15 times faster than the same functions in matlab/numpy, and part of this is because of using a simple data structure (double*).

(in reply to: ↑ 6 ) 11/11/2008 12:33:14 PM changed by mabshoff

Replying to was:

Make the real_roots and timeseries files use the Cython buffer interface

If you do that to timeseries, please do not degrade performance at all. The whole point of timeseries is "blazing speed" -- many of the functions are easily 10-15 times faster than the same functions in matlab/numpy, and part of this is because of using a simple data structure (double*).

If I understand this patch correctly the buffer interface in Cython to numpy avoids copying data. Since the data are stored as double* there should be a small increase in performance and the use of the buffer interface does not imply that functionality gets moved from something else to numpy.

Cheers,

Michael

11/11/2008 01:16:40 PM changed by jason

Michael,

Yep, that's the idea. If the buffer interface doesn't work, then just work with the raw double C array that underlies a (sufficiently nice) numpy array.

The patch currently does horrible performance things with timeseries just to get it to not depend on gsl. The final version will access the numpy double array either using the buffer interface (preferably) or with straight C.

11/11/2008 01:31:33 PM changed by jason

I guess I should make clear that the only reason I am touching the time series and the real roots files are because they rather heavily used the GSL structures underlying RDF and CDF vectors. If the authors want to change it to numpy or to their own double array structure, they are more than welcome. Needless to say, if I end up doing it, we should call for the original authors' comments before putting this patch in.

11/11/2008 07:59:02 PM changed by was

This is pretty painful to see. Basically your patch will make create a real_double_dense vector from a TimeSeries? extremely slow. (I.e., the function .vector() becomes way slower.) Fortunately, it appears I don't use that function anywhere else in the file. It would be good for you to either fix this, or put a large WARNING in the docstring for that function that it is slow and a pointer to a trac ticket about fixing that problem. I say this because one of the design constraints (clearly stated I hope in the time_series.pyx file) is that every function in there be nearly "optimal".

220	 	        cdef RealDoubleVectorSpaceElement x = RealDoubleVectorSpaceElement(V, 0) 
221	 	        memcpy(x.v.data, self._values, sizeof(double)*self._length) 
 	222	        cdef Vector_real_double_dense x = Vector_real_double_dense(V, 0) 
 	223	        cdef int i 
 	224	        for i from 0 <= i < self._length: 
 	225	            x.set_unsafe(i, self._values[i]) 
 	226	#        cdef int i 
 	227	#        for i from 0<=i<self._length: 
 	228	#            x.set_unsafe(i,self._values[i]) 
 	229	 
 	230	#        memcpy(x._matrix_numpy.data, self._values, sizeof(double)*self._length) 

11/12/2008 06:52:49 AM changed by jason

I guess the other thing I should make perfectly clear is that the changes I made to the timeseries and the real_roots files were *only* to get sage -br to work so that I could concentrate on the doctests in the vector code. If the authors don't beat me to it, I plan to go through each of those changes and make them optimally access the numpy array (which should be just about as fast as the previous C double array access to the GSL vector). That's why the patch is "needs work"--it's a work-in-progress that I wanted to post to trac to (a) back up and (b) expose to a wider audience.

That said, I really appreciate your comments. I'll pay particular attention to optimizing these (if someone else doesn't beat me to it).

11/22/2008 02:53:17 PM changed by jason

Okay, after spending some time optimizing the time_series.pyx code, we are *faster* (for interesting big cases):

before patch:

sage: v = finance.TimeSeries([1..1e3])
sage: %timeit v.vector()
10000 loops, best of 3: 132 µs per loop
sage: v = finance.TimeSeries([1..1e4])
sage: %timeit v.vector()
1000 loops, best of 3: 198 µs per loop
sage: v = finance.TimeSeries([1..1e5])
sage: %timeit v.vector()
100 loops, best of 3: 2.02 ms per loop
sage: v = finance.TimeSeries([1..1e6])
%timeit v.vector()
sage: %timeit v.vector()
10 loops, best of 3: 17.1 ms per loop

after patch:

sage: v = finance.TimeSeries([1..1e3])
sage: %timeit v.vector()
10000 loops, best of 3: 148 µs per loop
sage: v = finance.TimeSeries([1..1e4])
sage: %timeit v.vector()
1000 loops, best of 3: 222 µs per loop
sage: v = finance.TimeSeries([1..1e5])
sage: %timeit v.vector()
1000 loops, best of 3: 1.26 ms per loop
sage: v = finance.TimeSeries([1..1e6])
sage: %timeit v.vector()
100 loops, best of 3: 10.1 ms per loop

11/22/2008 10:45:03 PM changed by jason

Okay, I updated all the docstrings; all tests pass.

My one concern is that in sage/rings/polynomial/real_roots.pyx, I tried to use the new cython buffer interface, but at compile-time, I got a maximum recursion depth exceeded error in the find_buffer_type function (I think that was the name). Right now, real_roots uses the slow standard python getitem method.

Either the Cython buffer issue should be fixed or we should make the real_roots access the vector in a faster way (preferably using the numpy C API or exposing that with, say, a get_unsafe_python function in the vector_double_dense file).

11/22/2008 10:47:22 PM changed by jason

This patch depends on #4570.

11/24/2008 08:15:21 AM changed by jason

  • summary changed from [with patch, needs work] convert RDF and CDF vectors to use numpy to [with patch, needs review] convert RDF and CDF vectors to use numpy.

Regarding the max recursion error mentioned above:

Dag Sverre Seljebotn wrote:
> Seems like an issue with circular cimports? Which means those aren't tested in the test framework. Anyway I was unable to provoke the behaviour with a testcase myself, but "coding in blind" then I assume that the following patch should fix it.
>
> Jason: To answer the question in the wiki about what to do for SAGE, I assume that we can quite quickly release a Cython 0.10.2 that incorporates this patch. Though Robert would be the one to give a real answer.
>
> diff -r 04e83ffd8ea2 Cython/Compiler/Buffer.py
> --- a/Cython/Compiler/Buffer.py Fri Nov 07 06:55:37 2008 +0100
> +++ b/Cython/Compiler/Buffer.py Sun Nov 23 16:58:15 2008 +0100
> @@ -710,7 +710,11 @@ def use_py2_buffer_functions(env):
>
>      # Search all types for __getbuffer__ overloads
>      types = []
> +    visited_scopes = set()
>      def find_buffer_types(scope):
> +        if scope in visited_scopes:
> +            return
> +        visited_scopes.add(scope)
>          for m in scope.cimported_modules:
>              find_buffer_types(m)
>          for e in scope.type_entries:
>

This patch made it compile and all the doctests pass.

Can we get this patch (or an equivalent one) into Sage as soon as possible?


Thanks,

Jason

11/24/2008 08:15:54 AM changed by jason

(when I say "all the doctests", I mean all the doctests in real_roots.pyx)

11/24/2008 08:16:28 AM changed by jason

  • attachment vector-RDF-CDF-numpy.patch added.

11/24/2008 08:19:31 AM changed by jason

  • owner changed from was to jason.
  • status changed from new to assigned.

To review this patch:

  1. Start with sage-3.2
  2. Apply the patch at #4570
  3. Apply the following patch to /sage/local/lib/python2.5/site-packages/Cython/Compiler/Buffer.py:
diff -r 04e83ffd8ea2 Cython/Compiler/Buffer.py
--- a/Cython/Compiler/Buffer.py Fri Nov 07 06:55:37 2008 +0100
+++ b/Cython/Compiler/Buffer.py Sun Nov 23 16:58:15 2008 +0100
@@ -710,7 +710,11 @@ def use_py2_buffer_functions(env):
     # Search all types for __getbuffer__ overloads
     types = []
+    visited_scopes = set()
     def find_buffer_types(scope):
+        if scope in visited_scopes:
+            return
+        visited_scopes.add(scope)
         for m in scope.cimported_modules:
             find_buffer_types(m)
         for e in scope.type_entries:
  1. Of course, do sage -br

11/24/2008 08:28:25 AM changed by jason

Oh, yeah, and step 3.5: apply the patch on this ticket :).

11/24/2008 12:28:06 PM changed by jason

  • attachment vector-rdf-doctest-correction.patch added.

11/24/2008 12:28:53 PM changed by jason

With the second patch on this ticket, "make test" passes in sage 3.2.

11/24/2008 02:03:58 PM changed by jason

The patch on this ticket solves #4491 as well.

11/26/2008 09:28:46 AM changed by jason

I'm rebasing this to merge after #4580 has been merged.

11/26/2008 09:40:45 AM changed by jason

  • attachment vector-RDF-CDF-numpy.2.patch added.

Rebased for sage-3.2.1.alpha1

11/26/2008 09:42:35 AM changed by jason

New instructions that should work for sage-3.2.1.alpha1:

  1. Apply, in order, vector-RDF-CDF-numpy.2.patch and vector-rdf-doctest-correction.patch
  1. Do sage -br

Enjoy!

11/26/2008 09:43:59 AM changed by jason

With either sage version, you might have to delete the Cython cache by removing the file SAGE_ROOT/devel/sage/.cython_deps before doing sage -br. This is because I remove a Cython file in the patch.

11/26/2008 09:29:49 PM changed by was

  • summary changed from [with patch, needs review] convert RDF and CDF vectors to use numpy to [with patch, needs works] convert RDF and CDF vectors to use numpy.

Needs work. The top of the file that implements complex double vectors starts out like this (matrices not vectors, and deletes all the old Complex double vector stuff).

        1       """ 
 	2	Dense matrices over the Real Double Field. 
 	3	 
 	4	Matrix operations using numpy. 
 	5	 
 	6	EXAMPLES: 
 	7	    sage: b=Mat(RDF,2,3).basis() 
 	8	    sage: b[0] 
 	9	    [1.0 0.0 0.0] 
 	10	    [0.0 0.0 0.0] 
 	11	 
 	12	 
 	13	We deal with the case of zero rows or zero columns: 
 	14	    sage: m = MatrixSpace(RDF,0,3) 
 	15	    sage: m.zero_matrix() 
 	16	    [] 
 	17	 
 	18	TESTS: 
 	19	    sage: a = matrix(RDF,2,range(4), sparse=False) 
 	20	    sage: loads(dumps(a)) == a 
 	21	    True 
 	22	 
 	23	AUTHORS: 
 	24	    -- Jason Grout, Sep 2008: switch to numpy backend 
 	25	    -- Josh Kantor 
 	26	    -- William Stein: many bug fixes and touch ups. 
 	27	""" 
 	28	 
 	29	############################################################################## 
 	30	#       Copyright (C) 2004,2005,2006 Joshua Kantor <kantor.jm@gmail.com> 
 	31	#  Distributed under the terms of the GNU General Public License (GPL) 
 	32	#  The full text of the GPL is available at: 
 	33	#                  http://www.gnu.org/licenses/ 
 	34	############################################################################## 
 	35	from sage.rings.complex_double import CDF 
 	36	 
 	37	cimport sage.ext.numpy as cnumpy 
 	38	 
 	39	numpy=None 
 	40	 
 	41	cdef class Vector_complex_double_dense(vector_double_dense.Vector_double_dense): 

11/27/2008 07:22:30 AM changed by jason

Apparently I concentrated so much on making the documentation in vector_double_dense.pyx good that I forgot to make sure the documentation in subclasses was up to par. I based the implementation on CDF/RDF matrix code, but apparently forgot to change the docs. I'll post a new patch today.

11/27/2008 06:57:37 PM changed by jason

  • attachment vector-RDF-CDF-numpy-sage-3.2.1.alpha2.patch added.

11/27/2008 06:59:41 PM changed by jason

  • summary changed from [with patch, needs works] convert RDF and CDF vectors to use numpy to [with patch, needs review] convert RDF and CDF vectors to use numpy.

The patch vector-RDF-CDF-numpy-sage-3.2.1.alpha2.patch has been rebased to sage-3.2.1.alpha2 and takes care of the previous concerns.

11/28/2008 01:55:42 PM changed by was

  • summary changed from [with patch, needs review] convert RDF and CDF vectors to use numpy to [with patch, needs work] convert RDF and CDF vectors to use numpy.

It doesn't build for me:

was@sage:~/build/sage-3.2.1.alpha1$ ./sage
----------------------------------------------------------------------
| Sage Version 3.2.1.alpha1, Release Date: 2008-11-26                |
| Type notebook() for the GUI, and license() for information.        |
----------------------------------------------------------------------
hg_sage.revesage: hg_sage.revert('--all')
cd "/home/was/build/sage-3.2.1.alpha1/devel/sage" && hg revert --all
reverting module_list.py
forgetting sage/groups/multiplicative_wrapper.pxd
forgetting sage/groups/multiplicative_wrapper.pyx
sage: hg_sage.apply('http://trac.sagemath.org/sage_trac/attachment/ticket/4206/vector-RDF-CDF-numpy-sage-3.2.1.alpha2.patch')
Attempting to load remote file: http://trac.sagemath.org/sage_trac/attachment/ticket/4206/vector-RDF-CDF-numpy-sage-3.2.1.alpha2.patch?format=raw
Loading: [...........]
cd "/home/was/build/sage-3.2.1.alpha1/devel/sage" && hg status
cd "/home/was/build/sage-3.2.1.alpha1/devel/sage" && hg status
cd "/home/was/build/sage-3.2.1.alpha1/devel/sage" && hg import   "/home/was/.sage/temp/sage/15486/tmp_0.patch"
applying /home/was/.sage/temp/sage/15486/tmp_0.patch
sage: 
Exiting SAGE (CPU time 0m0.12s, Wall time 0m19.72s).
was@sage:~/build/sage-3.2.1.alpha1$ ./sage -br

----------------------------------------------------------
sage: Building and installing modified Sage library files.


Installing c_lib
scons: `install' is up to date.
Updating Cython code....
Building modified file sage/modules/free_module_element.pyx.
Building modified file sage/modules/vector_double_dense.pyx.
Building modified file sage/modules/vector_complex_double_dense.pyx.
Building modified file sage/modules/vector_real_double_dense.pyx.
Building modified file sage/matrix/matrix_double_dense.pyx.
Building sage/matrix/matrix_real_double_dense.pyx because it depends on sage/ext/numpy.pxd.
Building sage/matrix/change_ring.pyx because it depends on sage/ext/numpy.pxd.
Building sage/matrix/matrix_complex_double_dense.pyx because it depends on sage/ext/numpy.pxd.
Traceback (most recent call last):
  File "setup.py", line 486, in <module>
    queue = compile_command_list(ext_modules, deps)
  File "setup.py", line 456, in compile_command_list
    dep_file, dep_time = deps.newest_dep(f)
  File "setup.py", line 371, in newest_dep
    for f in self.all_deps(filename):
  File "setup.py", line 354, in all_deps
    deps.update(self.all_deps(f, path))
  File "setup.py", line 352, in all_deps
    for f in self.immediate_deps(filename):
  File "setup.py", line 334, in immediate_deps
    self._deps[filename] = self.parse_deps(filename)
  File "setup.py", line 290, in parse_deps
    f = open(filename)
IOError: [Errno 2] No such file or directory: 'sage/modules/real_double_vector.pxd'
sage: There was an error installing modified sage library code.

11/28/2008 01:59:33 PM changed by was

The above is a bug in the cython dependency checking code. I'll do a full rebuild.

That said, I can't imagine that unpickling real double vectors could work because that should call the real_double_vector module. We shall see :-)

11/28/2008 02:04:58 PM changed by was

This patch seems to totally get rid of the real_double_vector module. After applying it and making sure to delete all traces of it from the build directory, I get the following on startup of sage:

{{ /home/was/build/sage-3.2.1.alpha1/local/lib/python2.5/site-packages/sage/modules/free_quadratic_module.py in <module>()

92 import sage.rings.integer 93 import sage.structure.parent_gens as gens

---> 94 import sage.modules.real_double_vector

95 import sage.modules.complex_double_vector 96 from sage.misc.randstate import current_randstate

ImportError?: No module named real_double_vector }}}

11/28/2008 02:07:01 PM changed by was

  • attachment v.sobj added.

Reviewer -- make sure that the attached file v.sobj can be loaded after this patch is applied and all traces of old real_double_vector* stuff is deleted from SAGE_ROOT/devel/sage/build/...

12/03/2008 09:31:58 AM changed by jason

The updated (sage-3.2.1) patch deletes the real_double_vector.pyx and complex_double_vector.pyx files and instead creates (real|complex)_double_vector.py files, which basically contain references to the unpickling functions and make the old classes aliases for the new classes. We could just as well make the old classes = None (and I think unpickling old things will still work), but this way there is some sort of backwards compatibility for people still using the old class names.

12/03/2008 09:48:32 AM changed by jason

  • summary changed from [with patch, needs work] convert RDF and CDF vectors to use numpy to [with patch, needs review] convert RDF and CDF vectors to use numpy.

I updated the patch to address a doctest failure. The updated patch now passes doctests in sage/modules/*.(py|pyx) and also passes doctests in sage/structure/sage_object.pyx.

This should be ready to be reviewed again.

The only patch you need to apply is vector-RDF-CDF-numpy-sage-3.2.1.patch to sage 3.2.1.

12/03/2008 09:53:38 AM changed by jason

To review this patch, you need to delete the real_double_vector.* and complex_double_vector.* files out of sage/build/ directories. One way to do this is:

cd $SAGE_ROOT/devel/sage/build find . | grep real_double_vector | xargs rm find . | grep complex_double_vector | xargs rm

Then do sage -br after applying the patch. Then you should be ready to test the patch.

12/04/2008 06:45:29 PM changed by cwitty

Jason asked me to look in particular at the real_roots.pyx parts of this patch, so I did:

The changes to real_roots.pyx look good. Also, I ran a couple of my standard benchmarks, and the pre- and post-patch timings are not measurably different.

Positive review for real_roots.pyx portion of the patch.

12/05/2008 09:23:07 PM changed by jason

A reminder that once this is closed, close #4491 as well.

12/06/2008 02:49:13 PM changed by was

  • summary changed from [with patch, needs review] convert RDF and CDF vectors to use numpy to [with patch, needs work] convert RDF and CDF vectors to use numpy.

I was unable to build this after applying it to sage-3.2.2.alpha0:

building 'sage.rings.polynomial.real_roots' extension
gcc -fno-strict-aliasing -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/was/build/sage-3.2.2.alpha0/local//include -I/home/was/build/sage-3.2.2
.alpha0/local//include/csage -I/home/was/build/sage-3.2.2.alpha0/devel//sage/sage/ext -I/home/was/build/sage-3.2.2.alpha0/local/include/python2.5 -c sage/rings/
polynomial/real_roots.c -o build/temp.linux-x86_64-2.5/sage/rings/polynomial/real_roots.o -w
sage/rings/polynomial/real_roots.c:119:31: error: numpy/arrayobject.h: No such file or directory
sage/rings/polynomial/real_roots.c:388: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_int8_t’
sage/rings/polynomial/real_roots.c:390: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_int16_t’
sage/rings/polynomial/real_roots.c:392: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_int32_t’
sage/rings/polynomial/real_roots.c:394: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_int64_t’
sage/rings/polynomial/real_roots.c:396: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_uint8_t’
sage/rings/polynomial/real_roots.c:398: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_uint16_t’
sage/rings/polynomial/real_roots.c:400: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_uint32_t’
sage/rings/polynomial/real_roots.c:402: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_uint64_t’
sage/rings/polynomial/real_roots.c:404: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_float32_t’
sage/rings/polynomial/real_roots.c:406: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_float64_t’
sage/rings/polynomial/real_roots.c:408: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_int_t’
sage/rings/polynomial/real_roots.c:410: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_long_t’
sage/rings/polynomial/real_roots.c:412: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_uint_t’
sage/rings/polynomial/real_roots.c:414: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_ulong_t’
sage/rings/polynomial/real_roots.c:416: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_float_t’
sage/rings/polynomial/real_roots.c:418: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_double_t’
sage/rings/polynomial/real_roots.c:420: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_longdouble_t’
sage/rings/polynomial/real_roots.c:422: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_cfloat_t’
sage/rings/polynomial/real_roots.c:424: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_cdouble_t’
sage/rings/polynomial/real_roots.c:426: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘__pyx_t_5numpy_clongdouble_t’
sage/rings/polynomial/real_roots.c:998: error: field ‘_numpy_dtypeint’ has incomplete type
sage/rings/polynomial/real_roots.c:1002: error: expected specifier-qualifier-list before ‘PyArrayObject’
sage/rings/polynomial/real_roots.c:2274: error: expected declaration specifiers or ‘...’ before ‘PyArrayObject’

This is on 64-bit sage.math.

12/06/2008 03:55:58 PM changed by jkantor

I get the same error stream as was on intel atom.

12/06/2008 07:38:20 PM changed by jason

  • attachment real_roots-import.patch added.

12/06/2008 07:41:54 PM changed by jason

  • summary changed from [with patch, needs work] convert RDF and CDF vectors to use numpy to [with patch, needs review] convert RDF and CDF vectors to use numpy.

I attached a patch which should take care of the problem. The problem was manifest on a system that does not have a system numpy; the include directories were not set up properly for real_roots.pyx. So now, apply vector-RDF-CDF-numpy-sage-3.2.1.patch and then apply real_roots-import.patch.

Let me know if the problem still persists.

12/07/2008 12:07:41 AM changed by jkantor

  • summary changed from [with patch, needs review] convert RDF and CDF vectors to use numpy to [with patch, with positive review after fft timing regression is addressed] convert RDF and CDF vectors to use numpy.

Overall the design looks good to me and the interface with numpy looks good. A few things I noticed.

1. For large arrays around a million, object creation is ever so slightly slower than the old class it seems, I'm guessing thats numpy's falt, there is maybe nothing you can do about that.

2. More seriously the fft function is quite a bit slower. For example it takes about 15 seconds to do fft on a vector of lenth a million. Scipy only takes 0.72 to do the fft, and the old class also only took about a second. I'll look and see if I have any suggestions about the bottle neck, this probably should be fixed. It seemed most other operations were about the same speed but it might be good to double check this.

3. One suggestion, I think it would cool if in vector_double_dense.pxd you make the _vector_numpy array public as in

cdef public cnumpy.ndarray _vector_numpy

then it is possible to do something like

v=vector(RDF,[1,2,3])
v._vector_numpy[0]=3

this could be useful to pass something to scipy without having to copy. You could make it readonly instead of public if you don't want people to be able to modify through that mechanism.

12/07/2008 11:19:56 AM changed by was

Scipy only takes 0.72 to do the fft,

Why don't we use scipy to do the fft, then?

12/08/2008 05:22:26 AM changed by jason

This is a little puzzling since the code *does* use scipy to do fft. Here is the relevant call:

            return v._new(scipy.fft(v._vector_numpy))

The thing that I see is that above this, I do:

        cdef Vector_double_dense v = self.complex_vector()

I think the wasted time is in one of those two calls. I can look at this later.

12/08/2008 10:34:58 AM changed by jason

  • summary changed from [with patch, with positive review after fft timing regression is addressed] convert RDF and CDF vectors to use numpy to [with patch, needs review] convert RDF and CDF vectors to use numpy.

I changed the behavior of the fft function in the vector_fft patch. It now does a complex fft over a CDF vector, but a *real* fft over an RDF vector. I also streamlined it, so now we get the following timings (after the patch):

sage: v=vector(RDF, range(1000000r))
sage: n=v.numpy()
sage: import scipy; import scipy.fftpack
sage: timeit('a=v.fft()')
5 loops, best of 3: 96.8 ms per loop
sage: timeit('b=scipy.fftpack.rfft(n)')
5 loops, best of 3: 98.3 ms per loop
sage: timeit('a=v.inv_fft()')
5 loops, best of 3: 106 ms per loop
sage: timeit('b=scipy.fftpack.irfft(n)')
5 loops, best of 3: 106 ms per loop
sage: v=vector(CDF, range(1000000r))
sage: n=v.numpy()
sage: timeit('a=v.fft()')
5 loops, best of 3: 182 ms per loop
sage: timeit('b=scipy.fftpack.fft(n)')
5 loops, best of 3: 181 ms per loop
sage: timeit('a=v.inv_fft()')
5 loops, best of 3: 207 ms per loop
sage: timeit('b=scipy.fftpack.ifft(n)')
5 loops, best of 3: 207 ms per loop

Since the behavior of the fft function is changed, I'm putting this as "needs review". What I'm looking for is someone to pass off on the use of the real fft for RDF vectors.

12/08/2008 03:08:30 PM changed by jason

On IRC, we decided to instead follow the convention in matlab and return a complex vector (and do a complex fft), even if we start out with a real vector.

So I'll make that change and then consider this a positive review (since then the functionality won't have changed, but the timing will still be good).

12/08/2008 10:48:12 PM changed by jason

  • attachment vector_fft.patch added.

apply on top of previous patches

12/08/2008 10:49:36 PM changed by jason

  • summary changed from [with patch, needs review] convert RDF and CDF vectors to use numpy to [with patch, positive review] convert RDF and CDF vectors to use numpy.

Okay, updated the vector_fft.patch. The new timing comparison is:

sage: v=vector(RDF, range(1000000r))
n=v.numpy()
sage: n=v.numpy()
sage: import scipy; import scipy.fftpack
sage: timeit('a=v.fft()')
5 loops, best of 3: 342 ms per loop
sage: timeit('b=scipy.fft(n)')
5 loops, best of 3: 316 ms per loop
sage: timeit('a=v.inv_fft()')
5 loops, best of 3: 442 ms per loop
sage: timeit('b=scipy.ifft(n)')
5 loops, best of 3: 416 ms per loop
sage: v=vector(CDF, range(1000000r))
sage: n=v.numpy()
sage: timeit('a=v.fft()')
5 loops, best of 3: 346 ms per loop
sage: timeit('b=scipy.fft(n)')
5 loops, best of 3: 319 ms per loop
sage: timeit('a=v.inv_fft()')
5 loops, best of 3: 470 ms per loop
sage: timeit('b=scipy.ifft(n)')
5 loops, best of 3: 419 ms per loop

Since the timing issue is addressed, I'm marking this as positive review as directed in jkantor's comment.

12/08/2008 11:43:44 PM changed by jkantor

Positive review. I would like the making the internal numpy array cdef public as a separate minor patch after this initial one is accepted.

12/08/2008 11:55:57 PM changed by mabshoff

There is some slight reject issue with the first patch:

***************
*** 93,100 ****
  import sage.rings.integer
  from sage.rings.real_double import RDF
  from sage.rings.complex_double import CDF
- 
- #from sage.matrix.matrix cimport Matrix
  
  def is_FreeModuleElement(x):
      return isinstance(x, FreeModuleElement)
--- 93,98 ----
  import sage.rings.integer
  from sage.rings.real_double import RDF
  from sage.rings.complex_double import CDF
  
  def is_FreeModuleElement(x):
      return isinstance(x, FreeModuleElement)

Cheers,

Michael

12/09/2008 12:05:55 AM changed by jason

  • attachment vector-RDF-CDF-numpy-sage-3.2.1.patch added.

12/09/2008 12:06:40 AM changed by jason

deleted reject hunk from file; the three patches should work now.

12/09/2008 12:26:12 AM changed by mabshoff

  • summary changed from [with patch, positive review] convert RDF and CDF vectors to use numpy to [with patch, needs work] convert RDF and CDF vectors to use numpy.

This patch needs doctest fixes. This is on sage.math:

sage -t -long "devel/sage/sage/finance/fractal.pyx"         

drfft:howmany=726335489
drfft:howmany=644245095
**********************************************************************
File "/scratch/mabshoff/release-cycle/sage-3.2.2.alpha1/devel/sage/sage/finance/fractal.pyx", line 109, in __main__.example_1
Failed example:
    sim = finance.stationary_gaussian_simulation(s, N)[Integer(0)]###line 66:_sage_    >>> sim = finance.stationary_gaussian_simulation(s, N)[0]
Exception raised:
    Traceback (most recent call last):
      File "/scratch/mabshoff/release-cycle/sage-3.2.2.alpha1/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/scratch/mabshoff/release-cycle/sage-3.2.2.alpha1/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/scratch/mabshoff/release-cycle/sage-3.2.2.alpha1/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_1[6]>", line 1, in <module>
        sim = finance.stationary_gaussian_simulation(s, N)[Integer(0)]###line 66:_sage_    >>> sim = finance.stationary_gaussian_simulation(s, N)[0]
      File "fractal.pyx", line 111, in sage.finance.fractal.stationary_gaussian_simulation (sage/finance/fractal.c:642)
      File "time_series.pyx", line 2172, in sage.finance.time_series.TimeSeries.fft (sage/finance/time_series.c:12114)
      File "/scratch/mabshoff/release-cycle/sage-3.2.2.alpha1/local/lib/python2.5/site-packages/scipy/fftpack/basic.py", line 179, in rfft
        return _raw_fft(tmp,n,axis,1,overwrite_x,work_function)
      File "/scratch/mabshoff/release-cycle/sage-3.2.2.alpha1/local/lib/python2.5/site-packages/scipy/fftpack/basic.py", line 49, in _raw_fft
        r = work_function(x,n,direction,overwrite_x=overwrite_x)
    error: (n*howmany==size(x)) failed for hidden howmany
**********************************************************************
<SNIP>

And

sage -t -long "devel/sage/sage/finance/time_series.pyx"     
**********************************************************************
File "/scratch/mabshoff/release-cycle/sage-3.2.2.alpha1/devel/sage/sage/finance/time_series.pyx", line 427, in __main__.example_15
Failed example:
    F = v.autoregressive_fit(Integer(100))###line 532:_sage_    >>> F = v.autoregressive_fit(100)
Exception raised:
    Traceback (most recent call last):
      File "/scratch/mabshoff/release-cycle/sage-3.2.2.alpha1/local/bin/ncadoctest.py", line 1231, in run_one_test
        self.run_one_example(test, example, filename, compileflags)
      File "/scratch/mabshoff/release-cycle/sage-3.2.2.alpha1/local/bin/sagedoctest.py", line 38, in run_one_example
        OrigDocTestRunner.run_one_example(self, test, example, filename, compileflags)
      File "/scratch/mabshoff/release-cycle/sage-3.2.2.alpha1/local/bin/ncadoctest.py", line 1172, in run_one_example
        compileflags, 1) in test.globs
      File "<doctest __main__.example_15[4]>", line 1, in <module>
        F = v.autoregressive_fit(Integer(100))###line 532:_sage_    >>> F = v.autoregressive_fit(100)
      File "time_series.pyx", line 557, in sage.finance.time_series.TimeSeries.autoregressive_fit (sage/finance/time_series.c:4388)
      File "time_series.pyx", line 2378, in sage.finance.time_series.autoregressive_fit (sage/finance/time_series.c:12648)
      File "time_series.pyx", line 1865, in sage.finance.time_series.TimeSeries.numpy (sage/finance/time_series.c:10977)
    MemoryError
**********************************************************************
<SNIP>

Cheers,

Michael

12/09/2008 12:45:03 AM changed by jason

  • attachment timeseries-64bit.patch added.

apply on top of previous patches

12/09/2008 01:02:28 AM changed by mabshoff

  • summary changed from [with patch, needs work] convert RDF and CDF vectors to use numpy to [with patch, positive review] convert RDF and CDF vectors to use numpy.

Positive review for the last patch.

Merge

  • vector-RDF-CDF-numpy-sage-3.2.1.patch
  • real_roots-import.patch
  • vector_fft.patch
  • timeseries-64bit.patch

Cheers,

Michael

12/09/2008 01:40:51 AM changed by mabshoff

  • status changed from assigned to closed.
  • resolution set to fixed.

Merged the above four patches in Sage 3.2.2.alpha1