Commit graph

48 commits

Author SHA1 Message Date
Serhiy Storchaka 96aeaec647
bpo-36793: Remove unneeded __str__ definitions. (GH-13081)
Classes that define __str__ the same as __repr__ can
just inherit it from object.
2019-05-06 22:29:40 +03:00
Serhiy Storchaka 3557b05c5a bpo-31690: Allow the inline flags "a", "L", and "u" to be used as group flags for RE. (#3885) 2017-10-24 23:31:42 +03:00
Serhiy Storchaka 0b5e61ddca bpo-30397: Add re.Pattern and re.Match. (#1646) 2017-10-04 20:09:49 +03:00
Serhiy Storchaka 12d6b5d156 bpo-30398: Add a docstring for re.error. (#1647)
Also document that some attributes may be None.
2017-05-27 16:12:48 +03:00
Serhiy Storchaka 898ff03e1e bpo-30215: Make re.compile() locale agnostic. (#1361)
Compiled regular expression objects with the re.LOCALE flag no longer
depend on the locale at compile time.  Only the locale at matching
time affects the result of matching.
2017-05-05 08:53:40 +03:00
Serhiy Storchaka 46ba6c8563 Issue #22831: Use "with" to avoid possible fd leaks. 2015-04-04 11:01:02 +03:00
Serhiy Storchaka 720b8c9dd7 Removed unused function linecol() (added in issue #22578 by mistake). 2014-12-01 11:25:21 +02:00
Serhiy Storchaka ad446d57a9 Issue #22578: Added attributes to the re.error class. 2014-11-10 13:49:00 +02:00
Serhiy Storchaka c7f7d3897e Issue #22434: Constants in sre_constants are now named constants (enum-like). 2014-11-09 20:48:36 +02:00
Serhiy Storchaka 4b8f8949b4 Issue #17381: Fixed handling of case-insensitive ranges in regular expressions.
Added new opcode RANGE_IGNORE.
2014-10-31 12:36:56 +02:00
Serhiy Storchaka 9baa5b2de2 Issue #22437: Number of capturing groups in regular expression is no longer
limited by 100.
2014-09-29 22:49:23 +03:00
Serhiy Storchaka 5c24d0e504 Issue #13592: Improved the repr for regular expression pattern objects.
Based on patch by Hugo Lopes Tavares.
2013-11-23 22:42:43 +02:00
Serhiy Storchaka 9acb9bc659 Fix issue #13169: Reimport MAXREPEAT into sre_constants.py. 2013-02-18 11:14:21 +02:00
Serhiy Storchaka 70ca0210e8 Issue #13169: The maximal repetition number in a regular expression has been
increased from 65534 to 2147483647 (on 32-bit platform) or 4294967294 (on
64-bit).
2013-02-16 16:47:47 +02:00
Serhiy Storchaka 385ecd84c8 Fix generating of sre_constants.h on Python 3. 2013-01-24 13:43:02 +02:00
Antoine Pitrou fd036451bf #2834: Change re module semantics, so that str and bytes mixing is forbidden,
and str (unicode) patterns get full unicode matching by default. The re.ASCII
flag is also introduced to ask for ASCII matching instead.
2008-08-19 17:56:33 +00:00
Guido van Rossum be19ed77dd Fix most trivially-findable print statements.
There's one major and one minor category still unfixed:
doctests are the major category (and I hope to be able to augment the
refactoring tool to refactor bona fide doctests soon);
other code generating print statements in strings is the minor category.

(Oh, and I don't know if the compiler package works.)
2007-02-09 05:37:30 +00:00
Barry Warsaw 8bee76106e PEP 292 classes Template and SafeTemplate are added to the string module.
This patch includes test cases and documentation updates, as well as NEWS file
updates.

This patch also updates the sre modules so that they don't import the string
module, breaking direct circular imports.
2004-08-25 02:22:30 +00:00
Gustavo Niemeyer ad3fc44ccb Implemented non-recursive SRE matching. 2003-10-17 22:13:16 +00:00
Raymond Hettinger 6b59f5f3fd Let library modules use the new keyword arguments for list.sort(). 2003-10-16 05:53:16 +00:00
Martin v. Löwis 78e2f06cc6 Fully support 32-bit codes. Enable BIGCHARSET in UCS-4 builds. 2003-04-19 12:56:08 +00:00
Guido van Rossum 41c99e7f96 SF patch #720991 by Gary Herron:
A small fix for bug #545855 and Greg Chapman's
addition of op code SRE_OP_MIN_REPEAT_ONE for
eliminating recursion on simple uses of pattern '*?' on a
long string.
2003-04-14 17:59:34 +00:00
Fred Drake b8f2274985 Added docstrings by Neal Norwitz. This closes SF bug #450980. 2001-09-04 19:10:20 +00:00
Fredrik Lundh 19af43d78a added martin's BIGCHARSET patch to SRE 2.1.1. martin reports 2x
speedups for certain unicode character ranges.
2001-07-02 16:58:38 +00:00
Fredrik Lundh b25e1ad253 sre 2.1b2 update:
- take locale into account for word boundary anchors (#410271)
- restored 2.0's *? behaviour (#233283, #408936 and others)
- speed up re.sub/re.subn
2001-03-22 15:50:10 +00:00
Fredrik Lundh f2989b22ff - restored 1.5.2 compatibility (sorry, eric)
- removed __all__ cruft from internal modules (sorry, skip)
- don't assume ASCII for string escapes (sorry, per)
2001-02-18 12:05:16 +00:00
Skip Montanaro 1ca2ed35e0 removed __all__ - should probably rename makedict to _makedict unless it is
to be exported
2001-02-18 03:13:08 +00:00
Skip Montanaro 0de65807e6 bunch more __all__ lists
also modified check_all function to suppress all warnings since they aren't
relevant to what this test is doing (allows quiet checking of regsub, for
instance)
2001-02-15 22:15:14 +00:00
Eric S. Raymond be18552874 String method conversion. 2001-02-09 10:30:23 +00:00
Fredrik Lundh b35ffc0417 added "magic" number to the _sre module, to avoid weird errors caused
by compiler/engine mismatches
2001-01-15 12:46:09 +00:00
Fredrik Lundh 770617b23e SRE fixes for 2.1 alpha:
-- added some more docstrings
-- fixed typo in scanner class (#125531)
-- the multiline flag (?m) should't affect the \Z operator (#127259)
-- fixed non-greedy backtracking bug (#123769, #127259)
-- added sre.DEBUG flag (currently dumps the parsed pattern structure)
-- fixed a couple of glitches in groupdict (the #126587 memory leak
   had already been fixed by AMK)
2001-01-14 15:06:11 +00:00
Fredrik Lundh 13ac9926ac Fixed too ambitious "nothing to repeat" check. Closes bug #114033. 2000-10-07 17:38:23 +00:00
Fredrik Lundh e186983842 final 0.9.8 updates:
-- added REPEAT_ONE operator
-- added ANY_ALL operator (used to represent "(?s).")
2000-08-01 22:47:49 +00:00
Fredrik Lundh 29c4ba9ada SRE 0.9.8: passes the entire test suite
-- reverted REPEAT operator to use "repeat context" strategy
   (from 0.8.X), but done right this time.
-- got rid of backtracking stack; use nested SRE_MATCH calls
   instead (should probably put it back again in 0.9.9 ;-)
-- properly reset state in scanner mode
-- don't use aggressive inlining by default
2000-08-01 18:20:07 +00:00
Fredrik Lundh 8a3ebf8ca8 -- SRE 0.9.6 sync. this includes:
+ added "regs" attribute
 + fixed "pos" and "endpos" attributes
 + reset "lastindex" and "lastgroup" in scanner methods
 + removed (?P#id) syntax; the "lastindex" and "lastgroup"
   attributes are now always set
 + removed string module dependencies in sre_parse
 + better debugging support in sre_parse
 + various tweaks to build under 1.5.2
2000-07-23 21:46:17 +00:00
Thomas Wouters 7e47402264 Spelling fixes supplied by Rob W. W. Hooft. All these are fixes in either
comments, docstrings or error messages. I fixed two minor things in
test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't").

There is a minor style issue involved: Guido seems to have preferred English
grammar (behaviour, honour) in a couple places. This patch changes that to
American, which is the more prominent style in the source. I prefer English
myself, so if English is preferred, I'd be happy to supply a patch myself ;)
2000-07-16 12:04:32 +00:00
Fredrik Lundh 72b82ba16d - fixed grouping error bug
- changed "group" operator to "groupref"
2000-07-03 21:31:48 +00:00
Fredrik Lundh 7cafe4d7e4 - actually enabled charset anchors in the engine (still not
used by the code generator)

- changed max repeat value in engine (to match earlier array fix)

- added experimental "which part matched?" mechanism to sre; see
  http://hem.passagen.se/eff/2000_07_01_bot-archive.htm#416954
  or python-dev for details.
2000-07-02 17:33:27 +00:00
Fredrik Lundh 3562f11764 -- use charset bitmaps where appropriate. this gives a 5-10%
speedup for some tests, including the python tokenizer.

-- added support for an optional charset anchor to the engine
   (currently unused by the code generator).

-- removed workaround for array module bug.
2000-07-02 12:00:07 +00:00
Fredrik Lundh 22d2546520 today's SRE update:
-- changed 1.6 to 2.0 in the file headers

-- fixed ISALNUM macro for the unicode locale.  this
   solution isn't perfect, but the best I can do with
   Python's current unicode database.
2000-07-01 17:50:59 +00:00
Fredrik Lundh 43b3b49b5a - fixed lookahead assertions (#10, #11, #12)
- untabified sre_constants.py
2000-06-30 10:41:31 +00:00
Fredrik Lundh 8094611eb8 - fixed another split problem
(those semantics are weird...)

- got rid of $Id$'s (for the moment, at least).  in other
  words, there should be no more "empty" checkins.

- internal: some minor cleanups.
2000-06-29 18:03:25 +00:00
Fredrik Lundh 6c68dc7b1a - removed "alpha only" licensing restriction
- removed some hacks that worked around 1.6 alpha bugs
- removed bogus test code from sre_parse
2000-06-29 10:34:56 +00:00
Fredrik Lundh 436c3d58a2 towards 1.6b1 2000-06-29 08:58:44 +00:00
Jeremy Hylton b1aa19515f Fredrik Lundh: here's the 96.6% version of SRE 2000-06-01 17:39:12 +00:00
Guido van Rossum b81e70ebdb Fredrik Lundh: new snapshot. Mostly reindented.
This one should work with unicode expressions, and compile
a bit more silently.
2000-04-10 17:10:48 +00:00
Andrew M. Kuchling e3ba931aa4 This patch looks large, but it just deletes the ^M characters and
untabifies the files.  No actual code changes were made.
2000-04-02 05:22:30 +00:00
Guido van Rossum 7627c0de69 Added Fredrik Lundh's sre module and its supporting cast.
NOTE: THIS IS VERY ROUGH ALPHA CODE!
2000-03-31 14:58:54 +00:00