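AddressTree class

A class that represents the hierarchical address tree.

In jageocoder, addresses are not managed in tabular form. Instead, they are managed as a hierarchical tree: nodes representing prefectures sit under the root node, which has id=0, nodes representing municipalities sit under each prefecture node, and so on.

Each node is an object of the jageocoder.node.AddressNode class.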
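The hierarchy described above can be pictured with a small stand-alone sketch. The Node class and its add_child and fullname helpers are hypothetical names made up for illustration; they are not jageocoder's AddressNode API, and the node ids other than 0 are arbitrary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Node:
    """Toy stand-in for an address node (not jageocoder's AddressNode)."""
    id: int
    name: str
    parent: Optional["Node"] = None
    children: Dict[str, "Node"] = field(default_factory=dict)

    def add_child(self, node_id: int, name: str) -> "Node":
        child = Node(id=node_id, name=name, parent=self)
        self.children[name] = child
        return child

    def fullname(self) -> List[str]:
        """Names from the prefecture level down to this node."""
        names: List[str] = []
        cur: Optional[Node] = self
        # Walk up until the root (the only node without a parent) is reached.
        while cur is not None and cur.parent is not None:
            names.append(cur.name)
            cur = cur.parent
        return list(reversed(names))


root = Node(id=0, name="")               # the root node has id=0
tokyo = root.add_child(1, "東京都")       # prefecture level
shinjuku = tokyo.add_child(2, "新宿区")   # municipality level

print(shinjuku.fullname())
```

The point of the sketch is only that every address is reached by walking down from the root, one level per address element.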

This class also manages the database connection session. In other words, by creating multiple AddressTree objects, you can write code that uses multiple databases.

class jageocoder.tree.AddressTree(db_dir: Optional[PathLike] = None, mode: str = 'a', debug: Optional[bool] = None)

The address-tree structure.

db_path

Path to the sqlite3 database file.

Type:

str

dsn

RFC 1738 based database URL, the so-called "data source name".

Type:

str

trie_path

Path to the TRIE index file.

Type:

str

engine

The database engine which is used to connect to the database.

Type:

sqlalchemy.engine.Engine

conn

The connection object which is used to communicate with the database.

Type:

sqlalchemy.engine.Connection

session

The session object used for a series of database operations.

Type:

sqlalchemy.orm.Session

root

The root node of the tree.

Type:

AddressNode

trie

The TRIE index of the tree.

Type:

AddressTrie

mode

The mode in which this tree was opened.

Type:

str

config

Settings for the search method used by this tree.

Type:

dict

__init__(db_dir: Optional[PathLike] = None, mode: str = 'a', debug: Optional[bool] = None)

The initializer

Parameters:
  • db_dir (os.PathLike, optional) -- The database directory. If omitted, the directory returned by get_db_dir() is used. 'address.db' and 'address.trie' are stored under this directory.

  • mode (str, optional (default='a')) --

    Specifies the mode for opening the database.

    • In the case of 'a', if the database already exists, use it. Otherwise create a new one.

    • In the case of 'w', if the database already exists, delete it first. Then create a new one.

    • In the case of 'r', if the database already exists, use it. Otherwise raise a JageocoderError exception.

  • debug (bool, optional (default=False)) -- Debugging flag. If set to True, write debugging messages. If omitted, refer to the 'JAGEOCODER_DEBUG' environment variable, or use False if the environment variable is also undefined.
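The three open modes can be summarized as a small decision function. This is a sketch of the documented behavior only, not the library's code: plan_open and DatabaseNotFound are hypothetical names, and DatabaseNotFound merely stands in for JageocoderError.

```python
class DatabaseNotFound(Exception):
    """Hypothetical stand-in for JageocoderError in this sketch."""


def plan_open(mode: str, db_exists: bool) -> str:
    """Return the action implied by the mode: 'use', 'create' or 'recreate'."""
    if mode == "a":
        # Use the existing database, otherwise create a new one.
        return "use" if db_exists else "create"
    if mode == "w":
        # An existing database is deleted first, then a new one is created.
        return "recreate" if db_exists else "create"
    if mode == "r":
        if db_exists:
            return "use"
        raise DatabaseNotFound("database does not exist")
    raise ValueError(f"unknown mode: {mode!r}")


for mode in ("a", "w"):
    print(mode, plan_open(mode, True), plan_open(mode, False))
```

Only 'r' can fail on a missing database; 'a' and 'w' always end with a usable database.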

add_address(address_names: List[str], do_update: bool = False, cache: Optional[LRU] = None, **kwargs) AddressNode

Create a new AddressNode and add it to the tree.

Parameters:
  • address_names (list of str) -- A list of the address element names. For example, ["東京都","新宿区","西新宿", "2丁目"]

  • do_update (bool) -- When an address with the same name already exists, update it with the value of kwargs if 'do_update' is true, otherwise do nothing.

  • cache (LRU, optional) -- A dict object to use as a cache for improving performance, whose keys are the address notation from the prefecture level and whose values are the corresponding nodes. If not specified or None is given, do not use the cache.

  • **kwargs (properties of the new address node.) --

    • x : float. X coordinate or longitude in decimal degrees.

    • y : float. Y coordinate or latitude in decimal degrees.

    • level : int. Level of the node.

    • note : str. Note.

Returns:

The added node.

Return type:

AddressNode
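How do_update and cache interact can be illustrated with a toy, in-memory version. Everything below is an assumption made for illustration: the Node class, the plain dict used in place of an LRU object, and the choice to attach kwargs only to the last address element. It does not reproduce how jageocoder stores nodes in the database.

```python
from typing import Any, Dict, List, Optional


class Node:
    """Toy node; attribute names mirror the keyword arguments listed above."""

    def __init__(self, name: str, **attrs: Any) -> None:
        self.name = name
        self.attrs: Dict[str, Any] = dict(attrs)
        self.children: Dict[str, "Node"] = {}


def add_address(root: Node, address_names: List[str], do_update: bool = False,
                cache: Optional[Dict[str, Node]] = None, **kwargs: Any) -> Node:
    """Walk down from root, creating missing nodes; return the last node."""
    node = root
    for depth, name in enumerate(address_names, start=1):
        key = "".join(address_names[:depth])  # notation from the prefecture level
        is_last = depth == len(address_names)
        found = cache.get(key) if cache is not None else None
        if found is None:
            found = node.children.get(name)
        if found is None:
            # Assumption: only the last element receives the kwargs.
            found = Node(name, **(kwargs if is_last else {}))
            node.children[name] = found
        elif is_last and do_update:
            found.attrs.update(kwargs)
        if cache is not None:
            cache[key] = found
        node = found
    return node


root = Node("")
cache: Dict[str, Node] = {}
names = ["東京都", "新宿区"]
first = add_address(root, names, cache=cache, x=139.70, y=35.69, level=3)
same = add_address(root, names, cache=cache, x=0.0)            # ignored
updated = add_address(root, names, do_update=True, cache=cache, note="memo")
```

With do_update left at False the second call returns the existing node unchanged; the third call merges the new attribute into it.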

check_line_format(args: List[str]) int

Receives split args from a line of comma-separated text representing a single address element, and returns the format ID.

Parameters:

args (list[str]) -- List of split args in a line

Returns:

The id of the identified format.

  1. Address names without level, lon, lat

  2. Address names without level, lon, lat, note

  3. Address names without level, lon, lat, level without note

  4. Address names without level, lon, lat, level, note

Return type:

int

Examples

>>> from jageocoder_converter import BaseConverter
>>> base = BaseConverter()
>>> base.check_line_format(['1;北海道','3;札幌市','4;中央区','141.34103','43.05513'])
1
>>> base.check_line_format(['1;北海道','3;札幌市','4;中央区','5;大通','6;西二十丁目','141.326249','43.057218','01101/ODN-20/'])
2
>>> base.check_line_format(['北海道','札幌市','中央区','大通','西二十丁目','141.326249','43.057218',6])
3
>>> base.check_line_format(['北海道','札幌市','中央区','大通','西二十丁目','141.326249','43.057218',6,'01101/ODN-20/'])
4
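One way to tell the four formats apart, consistent with the sample calls above, is sketched below. The detection rule (a ';' level prefix in the first element, then counting numeric trailing fields) is a guess derived only from those samples, and guess_line_format is a hypothetical name; the library's actual logic may differ.

```python
from typing import Any, Sequence


def _is_float(value: Any) -> bool:
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def guess_line_format(args: Sequence[Any]) -> int:
    """Guess the format ID (1-4) of one split line."""
    has_level_prefix = ";" in str(args[0])  # e.g. '1;北海道'
    if has_level_prefix:
        # ..., lon, lat           -> 1
        # ..., lon, lat, note     -> 2
        return 1 if _is_float(args[-1]) and _is_float(args[-2]) else 2
    # ..., lon, lat, level        -> 3
    # ..., lon, lat, level, note  -> 4
    last_three_numeric = all(_is_float(a) for a in args[-3:])
    return 3 if last_three_numeric else 4


print(guess_line_format(['1;北海道', '3;札幌市', '4;中央区', '141.34103', '43.05513']))
```

A note that happens to be purely numeric would be misclassified by this rule, which is one reason to treat it as an illustration rather than a specification.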
close() NoReturn
create_note_index_table() None

Collect notes from all address elements and create a search table with an index.

create_trie_index() None

Create the TRIE index from the tree.

get_address_node(id: int) AddressNode

Get address node from the tree by its id.

Parameters:

id (int) -- The node id.

Returns:

Node with the specified ID.

Return type:

AddressNode

get_cache_info() dict
get_config(keys: Optional[Union[str, List[str]]] = None)

Get configurable parameter(s).

Parameters:

keys (str, List[str], optional) -- If the name of a single parameter is specified, return its value. If a list of names is specified, a dict of the specified keys and their values is returned. If omitted, a dict of all parameters is returned.

Return type:

Any, or dict.

Examples

>>> import jageocoder
>>> jageocoder.init()
>>> jageocoder.get_module_tree().get_config('aza_skip')
'off'
>>> jageocoder.get_module_tree().get_config(['best_only', 'target_area'])
{'best_only': True, 'target_area': []}
>>> jageocoder.get_module_tree().get_config()
{'debug': False, 'aza_skip': 'off', 'best_only': True, 'target_area': [], 'require_coordinates': False}
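The three calling conventions shown in the samples can be mimicked with a plain dict. This is a stand-alone sketch: the module-level CONFIG dict and this get_config function are illustrative, with values copied from the sample output above.

```python
from typing import Any, Dict, List, Optional, Union

# Values copied from the sample output above.
CONFIG: Dict[str, Any] = {
    "debug": False, "aza_skip": "off", "best_only": True,
    "target_area": [], "require_coordinates": False,
}


def get_config(keys: Optional[Union[str, List[str]]] = None) -> Any:
    if keys is None:
        return dict(CONFIG)                     # all parameters
    if isinstance(keys, str):
        return CONFIG[keys]                     # a single value
    return {key: CONFIG[key] for key in keys}   # selected key-value pairs


print(get_config("aza_skip"))
print(get_config(["best_only", "target_area"]))
```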
get_node_by_id(node_id: int) AddressNode

Get the full node information by its id.

Parameters:

node_id (int) -- The target node id.

Return type:

AddressNode

get_node_fullname(node: Union[AddressNode, int]) List[str]
get_root() AddressNode

Get the root-node of the tree. If not set yet, create and get the node from the database.

Returns:

The root node object.

Return type:

AddressNode

get_trie_nodes() TrieNode

Get the TRIE node table.

Notes

  • Todo: If the trie index has not been created, create it.

get_version() str

Get the version of the tree file.

Returns:

The version string.

Return type:

str

is_version_compatible() bool

Check if the dictionary version is compatible with the package.

Returns:

True if compatible, otherwise False.

Return type:

bool

parse_line_args(args: List[str], format_id: int) list

Receives split args from a line of comma-separated text representing a single address element, and returns a list of parsed attributes.

Parameters:
  • args (list[str]) -- List of split args in a line

  • format_id (int) -- The id of the line format identified by check_line_format

Returns:

A list containing the following attributes.

  • Address names: list[str]

  • Longitude: float

  • Latitude: float

  • Level: int or None

  • note: str or None

Return type:

list

Examples

>>> from jageocoder_converter import BaseConverter
>>> base = BaseConverter()
>>> base.parse_line_args(['1;北海道','3;札幌市','4;中央区','141.34103','43.05513'], 1)
[['1;北海道','3;札幌市','4;中央区'], 141.34103, 43.05513, None, None]
>>> base.parse_line_args(['1;北海道','3;札幌市','4;中央区','5;大通','6;西二十丁目','141.326249','43.057218','01101/ODN-20/'], 2)
[['1;北海道','3;札幌市','4;中央区','5;大通','6;西二十丁目'],141.326249,43.057218,None,'01101/ODN-20/']
>>> base.parse_line_args(['北海道','札幌市','中央区','大通','西二十丁目','141.326249','43.057218',6,'01101/ODN-20/'], 4)
[['北海道','札幌市','中央区','大通','西二十丁目'],141.326249,43.057218,6,'01101/ODN-20/']
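The mapping from format ID to field positions implied by the samples can be written compactly. The function below is a sketch that reproduces the sample outputs; parse_line is a hypothetical name and the table of trailing-field counts is inferred from the samples, not taken from the library.

```python
from typing import Any, List, Sequence


def parse_line(args: Sequence[Any], format_id: int) -> List[Any]:
    """Return [names, lon, lat, level, note] for one split line."""
    # Number of trailing non-name fields for each format (inferred).
    tail = {1: 2, 2: 3, 3: 3, 4: 4}[format_id]
    names = list(args[:-tail])
    lon, lat = float(args[-tail]), float(args[-tail + 1])
    level = int(args[-tail + 2]) if format_id in (3, 4) else None
    note = str(args[-1]) if format_id in (2, 4) else None
    return [names, lon, lat, level, note]


print(parse_line(['1;北海道', '3;札幌市', '4;中央区', '141.34103', '43.05513'], 1))
```

An unknown format_id raises KeyError in this sketch; how the real method reacts is not stated in this section.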
read_file(path: PathLike, do_update: bool = False) None

Add AddressNodes from a text file. See 'data/test.txt' for the format of the text file.

Parameters:
  • path (os.PathLike) -- Text file path.

  • do_update (bool (default=False)) -- When an address with the same name already exists, update it with the value of the new data if 'do_update' is true, otherwise do nothing.

read_stream(fp: TextIO, do_update: bool = False) None

Add AddressNodes to the tree from a stream.

Parameters:
  • fp (io.TextIO) -- Input text stream.

  • do_update (bool (default=False)) -- When an address with the same name already exists, update it with the value of the new data if 'do_update' is true, otherwise do nothing.

save_all() None

Save all AddressNodes in the tree to the database.

search(query: str, **kwargs) list
searchNode(query: str) List[Result]

Searches for address nodes corresponding to an address notation and returns the matching substring and a list of nodes.

Parameters:

query (str) -- An address notation to be searched.

Returns:

A list of AddressNode and matched substring pairs.

Return type:

list

Note

The search_by_trie function returns the standardized string as the match string. In contrast, the searchNode function returns the de-standardized string.

Examples

>>> import jageocoder
>>> jageocoder.init()
>>> tree = jageocoder.get_module_tree()
>>> tree.searchNode('多摩市落合1-15-2')
[[[11460207:東京都(139.69178,35.68963)1(lasdec:130001/jisx0401:13)]>[12063502:多摩市(139.446366,35.636959)3(jisx0402:13224)]>[12065383:落合(139.427097,35.624877)5(None)]>[12065384:一丁目(139.427097,35.624877)6(None)]>[12065390:15番地(139.428969,35.625779)7(None)], '多摩市落合1-15-']]
search_by_tree(address_names: List[str]) AddressNode

Get the node corresponding to a list of address element names by recursively searching for child nodes in the tree.

For example, ['東京都','新宿区','西新宿','二丁目'] first searches for the '東京都' node under the root node, then searches for the '新宿区' node among the children of the '東京都' node. This process is repeated, and the '二丁目' node, which is a child of the '西新宿' node, is returned.

Parameters:

address_names (list of str) -- A list of address element names to be searched.

Returns:

The node matched last.

Return type:

AddressNode
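The recursive descent described for search_by_tree can be sketched on an in-memory tree. The Node class here is a toy, and returning the deepest matched node when an element is missing is an interpretation of "the node matched last", not confirmed behavior.

```python
from typing import Dict, List


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.children: Dict[str, "Node"] = {}

    def add(self, name: str) -> "Node":
        """Return the child called `name`, creating it if needed."""
        return self.children.setdefault(name, Node(name))


def search_by_tree(node: Node, address_names: List[str]) -> Node:
    """Return the node matched last while descending from `node`."""
    if not address_names:
        return node
    child = node.children.get(address_names[0])
    if child is None:
        return node  # assumption: stop at the deepest node that matched
    return search_by_tree(child, address_names[1:])


root = Node("")
chome = root.add("東京都").add("新宿区").add("西新宿").add("二丁目")
hit = search_by_tree(root, ["東京都", "新宿区", "西新宿", "二丁目"])
print(hit.name)
```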

search_by_trie(query: str) dict

Get the corresponding nodes using the TRIE index. Returns the address element nodes that match the longest leading part of the query string.

For example, '中央区中央1丁目' will return the nodes corresponding to '千葉県千葉市中央区中央一丁目' and '神奈川県相模原市中央区中央一丁目'.

Parameters:

query (str) -- An address notation to be searched.

Returns:

A dict object whose key is a node id and whose value is a list of node and substrings that match the query.
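Longest-prefix matching on a trie, the idea behind the description above, can be sketched with nested dicts of characters. This toy omits the string standardization that the real index applies (so the query must already use '一丁目' rather than '1丁目'), returns a tuple instead of the documented dict, and uses made-up node ids.

```python
from typing import Dict, List, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.node_ids: List[int] = []  # ids of address nodes whose key ends here


def insert(root: TrieNode, key: str, node_id: int) -> None:
    cur = root
    for ch in key:
        cur = cur.children.setdefault(ch, TrieNode())
    cur.node_ids.append(node_id)


def longest_prefix(root: TrieNode, query: str) -> Tuple[str, List[int]]:
    """Return the longest indexed prefix of `query` and its node ids."""
    best: Tuple[str, List[int]] = ("", [])
    cur = root
    for i, ch in enumerate(query):
        nxt = cur.children.get(ch)
        if nxt is None:
            break
        cur = nxt
        if cur.node_ids:
            best = (query[: i + 1], list(cur.node_ids))
    return best


trie = TrieNode()
insert(trie, "中央区", 7)                # hypothetical ids
insert(trie, "中央区中央一丁目", 101)     # e.g. the node under 千葉市
insert(trie, "中央区中央一丁目", 202)     # e.g. the node under 相模原市
print(longest_prefix(trie, "中央区中央一丁目5番"))
```

Because several addresses can share the same notation, one matched prefix can map to more than one node, as in the '中央区中央1丁目' example.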

search_nodes_by_codes(category: str, value: str) List[AddressNode]

Search nodes by category and value.

Parameters:
  • category (str) -- Category name such as 'jisx0402' or 'postcode'.

  • value (str) -- Target value.

  • levels (List[int], optional) -- The address levels of target nodes.

Return type:

List[AddressNode]
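A category/value lookup can be illustrated with the note strings that appear in the searchNode sample output (for example 'lasdec:130001/jisx0401:13'). The sketch assumes that notes are '/'-separated 'category:value' pairs and scans a list linearly; the Node tuple and note_to_codes helper are hypothetical, and the library presumably uses its note index table rather than a scan.

```python
from typing import Dict, List, NamedTuple, Optional


class Node(NamedTuple):
    id: int
    name: str
    level: int
    note: Optional[str]


def note_to_codes(note: Optional[str]) -> Dict[str, str]:
    """Split 'cat1:val1/cat2:val2' into {'cat1': 'val1', 'cat2': 'val2'}."""
    codes: Dict[str, str] = {}
    for part in (note or "").split("/"):
        category, sep, value = part.partition(":")
        if sep:
            codes[category] = value
    return codes


def search_nodes_by_codes(nodes: List[Node], category: str, value: str,
                          levels: Optional[List[int]] = None) -> List[Node]:
    return [
        n for n in nodes
        if note_to_codes(n.note).get(category) == value
        and (levels is None or n.level in levels)
    ]


# Ids, levels and notes taken from the searchNode sample output.
nodes = [
    Node(11460207, "東京都", 1, "lasdec:130001/jisx0401:13"),
    Node(12063502, "多摩市", 3, "jisx0402:13224"),
    Node(12065383, "落合", 5, None),
]
print(search_nodes_by_codes(nodes, "jisx0402", "13224"))
```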

set_config(**kwargs)

Set configuration parameters.

Note

The possible keywords and their meanings are as follows.

  • best_only: bool (default = True)

    If set to False, returns all search result candidates whose prefix matches.

  • aza_skip: bool, None (default = False)

    Specifies how to skip aza-names while searching nodes.

    • If None, make the decision automatically

    • If False, do not skip

    • If True, always skip

  • require_coordinates: bool (default = True)

    If set to False, nodes without coordinates are also included in the search.

  • target_area: List[str] (Default = [])

    Specify the areas to be searched. The area can be specified by the list of name of the node (such as prefecture name or city name), or JIS code.

update_name_index() int

Update name_index field using the standardizing logic of the current version.

Note

This method also updates the version information of the dictionary.

Returns:

Number of records updated.

Return type:

int

validate_config(key: str, value: Any) None

Validate configuration key and parameters.

Parameters:
  • key (str) -- The name of the parameter.

  • value (str, int, bool, None) -- The value to be set to the parameter.

Notes

Raises RuntimeError if the key-value pair is not valid.
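The relationship between set_config and validate_config can be sketched as "validate every pair, then store". The RULES table below is an assumption built from the keyword list documented for set_config; the real validation rules are not given in this section, and both functions here are stand-alone illustrations.

```python
from typing import Any, Dict, Tuple

# Hypothetical rules for this sketch only, based on the types listed
# for set_config; the library's actual checks may differ.
RULES: Dict[str, Tuple[type, ...]] = {
    "best_only": (bool,),
    "aza_skip": (bool, type(None)),
    "require_coordinates": (bool,),
    "target_area": (list,),
}


def validate_config(key: str, value: Any) -> None:
    if key not in RULES:
        raise RuntimeError(f"unknown config key: {key!r}")
    if not isinstance(value, RULES[key]):
        raise RuntimeError(f"invalid value for {key!r}: {value!r}")


def set_config(config: Dict[str, Any], **kwargs: Any) -> None:
    # Validate every pair first so that a bad pair leaves config untouched.
    for key, value in kwargs.items():
        validate_config(key, value)
    config.update(kwargs)


config: Dict[str, Any] = {"best_only": True, "target_area": []}
set_config(config, best_only=False, target_area=["東京都"])
print(config)
```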