pd.to_datetime() Time Processing Function

Official Documentation

https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html

pd.to_datetime()

pandas.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=None, format=None, exact=True, unit=None, infer_datetime_format=False, origin='unix', cache=True)

Parameters

(1) arg: int, float, str, datetime, list, tuple, 1-d array, Series, DataFrame/dict-like
The object to convert to a datetime.
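
As a minimal sketch of how the return type follows the input type, the snippet below passes a scalar, a list, a Series and a DataFrame. The sample dates and the variable names are illustrative assumptions, not part of the official documentation.

```python
import pandas as pd

# A scalar string yields a single Timestamp.
ts = pd.to_datetime("2022-02-25")

# A list (or tuple, or 1-d array) yields a DatetimeIndex.
idx = pd.to_datetime(["2022-02-25", "2022-02-26"])

# A Series yields a Series with a datetime64 dtype.
ser = pd.to_datetime(pd.Series(["2022-02-25", "2022-02-26"]))

# A DataFrame with year/month/day columns is assembled into one Series.
frame = pd.DataFrame({"year": [2021, 2022], "month": [1, 2], "day": [15, 25]})
assembled = pd.to_datetime(frame)
```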

(2) errors: {‘ignore’, ‘raise’, ‘coerce’}, default ‘raise’

  • If ‘raise’, then invalid parsing will raise an exception.
  • If ‘coerce’, then invalid parsing will be set as NaT.
  • If ‘ignore’, then invalid parsing will return the input.
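
The difference between ‘raise’ and ‘coerce’ can be sketched as follows. The input list is a made-up example; ‘ignore’ is left out because recent pandas releases deprecate it.

```python
import pandas as pd

raw = ["2022-02-25", "not a date"]  # illustrative input with one bad entry

# errors="coerce": the unparseable entry becomes NaT instead of failing.
coerced = pd.to_datetime(raw, errors="coerce")

# errors="raise" (the default): the bad entry raises an exception.
try:
    pd.to_datetime(raw)
    raised = False
except ValueError:
    raised = True
```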

(3) dayfirst: bool, default False
Specify a date parse order if arg is str or is list-like. If True, parses dates with the day first, e.g. “10/11/12” is parsed as 2012-11-10.
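
A small sketch of the effect, using a four-digit-year variant of the string above (an assumption made here to keep the example unambiguous about the year):

```python
import pandas as pd

ambiguous = "10/11/2012"

# Default: the first field is read as the month.
month_first = pd.to_datetime(ambiguous)

# dayfirst=True: the first field is read as the day.
day_first = pd.to_datetime(ambiguous, dayfirst=True)
```

Here month_first is October 11, 2012, while day_first is November 10, 2012.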

(4) yearfirst: bool, default False
Specify a date parse order if arg is str or is list-like.

  • If True, parses dates with the year first, e.g. “10/11/12” is parsed as 2010-11-12.
  • If both dayfirst and yearfirst are True, yearfirst takes precedence (same as dateutil).
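
The first bullet can be tried directly with the string quoted above. This is only a sketch: newer pandas versions may emit a warning here because no single format can be inferred for a two-digit year.

```python
import pandas as pd

# With yearfirst=True the leading "10" is read as the year 2010.
year_first = pd.to_datetime("10/11/12", yearfirst=True)
```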

(5) utc: bool, default None
Control timezone-related parsing, localization and conversion.

  • If True, the function always returns a timezone-aware, UTC-localized Timestamp, Series or DatetimeIndex. To do this, timezone-naive inputs are localized as UTC, while timezone-aware inputs are converted to UTC.
  • If False or None (the default), inputs will not be coerced to UTC. Timezone-naive inputs will remain naive, while timezone-aware ones will keep their time offsets. Limitations exist for mixed offsets (typically, daylight saving time); see the Examples section of the official documentation for details.

See also: pandas general documentation about timezone conversion and localization.
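
A minimal sketch of the two behaviours, with illustrative timestamps chosen for this example:

```python
import pandas as pd

# Naive input + utc=True: the wall-clock time is kept and labelled as UTC.
naive_as_utc = pd.to_datetime("2022-02-25 12:00", utc=True)

# Aware input + utc=True: the instant is converted to UTC (12:00+02:00 is 10:00 UTC).
aware_as_utc = pd.to_datetime("2022-02-25 12:00+02:00", utc=True)

# Aware input without utc=True: the original +02:00 offset is preserved.
offset_kept = pd.to_datetime("2022-02-25 12:00+02:00")

# Naive input without utc=True: the result stays timezone-naive.
still_naive = pd.to_datetime("2022-02-25 12:00")
```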

(6) format: str, default None
The strftime format string used to parse the time, e.g. “%d/%m/%Y”. Note that “%f” will parse all the way up to nanoseconds. See the strftime documentation for more information on the available directives.
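
As a sketch, an explicit format removes the day/month ambiguity, and “%f” accepts nine fractional digits. The input strings are illustrative.

```python
import pandas as pd

# An explicit day/month/year format leaves no room for guessing.
parsed = pd.to_datetime("25/02/2022", format="%d/%m/%Y")

# "%f" consumes the fractional seconds down to nanoseconds.
with_fraction = pd.to_datetime(
    "2022-02-25 12:30:45.123456789", format="%Y-%m-%d %H:%M:%S.%f"
)
```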

(7) exact: bool, default True
Control how format is used:

  • If True, require an exact format match.
  • If False, allow the format to match anywhere in the target string.
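
The sketch below uses a made-up string with trailing text after the date. It shows only the trailing-text case; the exact error message varies between pandas versions.

```python
import pandas as pd

text = "25/02/2022 (provisional)"  # illustrative input with extra text

# exact=False: the format only has to match part of the string.
loose = pd.to_datetime(text, format="%d/%m/%Y", exact=False)

# exact=True (the default): the leftover text makes parsing fail.
try:
    pd.to_datetime(text, format="%d/%m/%Y")
    strict_failed = False
except ValueError:
    strict_failed = True
```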

(8) unit: str, default ‘ns’
The unit of arg (D, s, ms, us, ns) when arg is an integer or float number. The value is interpreted relative to origin. For example, with unit=’ms’ and origin=’unix’ (the default), arg is read as the number of milliseconds since the start of the Unix epoch.
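
To make the arithmetic concrete: 2022-02-25 is 19048 days after 1970-01-01, i.e. 19048 × 86400 = 1645747200 seconds. A minimal sketch with these illustrative values:

```python
import pandas as pd

# The same instant expressed in three different units since the Unix epoch.
from_seconds = pd.to_datetime(1645747200, unit="s")
from_millis = pd.to_datetime(1645747200000, unit="ms")
from_days = pd.to_datetime(19048, unit="D")
```

All three calls return the Timestamp 2022-02-25 00:00:00.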

(9) infer_datetime_format: bool, default False
If True and no format is given, attempt to infer the format of the datetime strings based on the first non-NaN element, and if it can be inferred, switch to a faster method of parsing them. In some cases this can increase the parsing speed by ~5-10x.

(10) origin: scalar, default ‘unix’
Define the reference date. The numeric values would be parsed as number of units (defined by unit) since this reference date.

  • If ‘unix’ (or POSIX) time, origin is set to 1970-01-01.
  • If ‘julian’, unit must be ‘D’, and origin is set to the beginning of the Julian Calendar. Julian day number 0 is assigned to the day starting at noon on January 1, 4713 BC.
  • If Timestamp convertible, origin is set to the Timestamp identified by origin.
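
A short sketch of a custom origin and of the ‘julian’ origin. The values are illustrative; the Julian date 2459635.5 is derived from the Unix epoch falling at Julian date 2440587.5, plus 19048 days.

```python
import pandas as pd

# Day offsets counted from a custom reference date instead of 1970-01-01.
offsets = pd.to_datetime([0, 1, 2], unit="D", origin=pd.Timestamp("2022-02-25"))

# A Julian date; unit must be "D" for origin="julian".
from_julian = pd.to_datetime(2459635.5, unit="D", origin="julian")
```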

(11) cache: bool, default True
If True, use a cache of unique, converted dates to apply the datetime conversion. May produce significant speed-up when parsing duplicate date strings, especially ones with timezone offsets. The cache is only used when there are at least 50 values. The presence of out-of-bounds values will render the cache unusable and may slow down parsing.
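
The cache only affects speed, not results. A minimal sketch, assuming an illustrative list of 100 strings with just two distinct values:

```python
import pandas as pd

# 100 strings but only two unique dates: a good case for the cache.
dates = ["2022-02-25", "2022-02-26"] * 50

cached = pd.to_datetime(dates, cache=True)
uncached = pd.to_datetime(dates, cache=False)

# Both code paths produce the same DatetimeIndex.
same_result = cached.equals(uncached)
```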


https://www.hardyhu.cn/2022/02/25/pd-to-datetime-Time-Processing-Function/
Author
John Doe
Posted on
February 25, 2022