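AddressTree class
A class that represents the hierarchical address tree.
In jageocoder, addresses are not stored in tabular form. They are managed as a hierarchical tree: under the root node, which has id=0, there are nodes representing prefectures, under those there are nodes representing municipalities, and so on.
Each node is an object of the jageocoder.node.AddressNode class.
This class also manages the database connection session. In other words, by creating multiple AddressTree objects you can write code that uses multiple databases.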
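To make the tree layout concrete, here is a minimal sketch of a root node with id=0, a prefecture below it, and deeper levels below that. SketchNode and its methods are hypothetical names used only for illustration; this is not how jageocoder.node.AddressNode is implemented, and the ids other than 0 are arbitrary.

```python
from typing import List, Optional


class SketchNode:
    """Hypothetical stand-in for an address node (not AddressNode)."""

    def __init__(self, node_id: int, name: str,
                 parent: Optional["SketchNode"] = None) -> None:
        self.id = node_id
        self.name = name
        self.parent = parent
        self.children: List["SketchNode"] = []

    def add_child(self, node_id: int, name: str) -> "SketchNode":
        """Create a child node, attach it to this node, and return it."""
        child = SketchNode(node_id, name, parent=self)
        self.children.append(child)
        return child

    def fullname(self) -> List[str]:
        """Collect the names from the prefecture level down to this node."""
        names: List[str] = []
        node: Optional["SketchNode"] = self
        # Stop before the root node, which has no parent and no name.
        while node is not None and node.parent is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


root = SketchNode(0, "")                      # the root node has id=0
tokyo = root.add_child(1, "東京都")            # prefecture level
shinjuku = tokyo.add_child(2, "新宿区")        # municipality level
nishi = shinjuku.add_child(3, "西新宿")
print(nishi.fullname())
```

Walking up the parent links from any node recovers its full address notation, which is the property that makes a tree a natural fit for addresses.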
- class jageocoder.tree.AddressTree(db_dir: Optional[PathLike] = None, mode: str = 'a', debug: Optional[bool] = None)
The address-tree structure.
- engine
The database engine which is used to connect to the database.
- Type:
sqlalchemy.engine.Engine
- conn
The connection object which is used to communicate with the database.
- Type:
sqlalchemy.engine.Connection
- session
The session object used for a series of database operations.
- Type:
sqlalchemy.orm.Session
- root
The root node of the tree.
- Type:
AddressNode
- trie
The TRIE index of the tree.
- Type:
AddressTrie
- __init__(db_dir: Optional[PathLike] = None, mode: str = 'a', debug: Optional[bool] = None)
The initializer
- Parameters:
db_dir (os.PathLike, optional) -- The database directory. If omitted, the directory returned by get_db_dir() is used. 'address.db' and 'address.trie' are stored under this directory.
mode (str, optional (default='a')) --
Specifies the mode for opening the database.
In the case of 'a', if the database already exists, use it. Otherwise create a new one.
In the case of 'w', if the database already exists, delete it first. Then create a new one.
In the case of 'r', if the database already exists, use it. Otherwise raise a JageocoderError exception.
debug (bool, optional (default=False)) -- Debugging flag. If set to True, debugging messages are written. If omitted, the 'JAGEOCODER_DEBUG' environment variable is consulted; if that variable is also undefined, False is used.
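The three mode values can be read as a small decision table. The sketch below reproduces that table against a temporary directory. It only illustrates the documented semantics: open_database and SketchError are hypothetical names, an empty file stands in for 'address.db', and none of this is jageocoder's actual code.

```python
import tempfile
from pathlib import Path
from typing import List


class SketchError(Exception):
    """Hypothetical stand-in for JageocoderError."""


def open_database(db_dir: Path, mode: str = "a") -> str:
    """Apply the mode rules and report whether the file is 'existing' or 'new'."""
    if mode not in ("a", "w", "r"):
        raise ValueError(f"unknown mode: {mode!r}")
    db_file = db_dir / "address.db"
    if mode == "r":
        # 'r': use an existing database, otherwise fail.
        if not db_file.exists():
            raise SketchError("database does not exist")
        return "existing"
    if mode == "w" and db_file.exists():
        # 'w': delete the existing database first.
        db_file.unlink()
    if db_file.exists():
        # 'a': use the existing database.
        return "existing"
    # 'a' without a database, or 'w' after deletion: create a new one.
    db_dir.mkdir(parents=True, exist_ok=True)
    db_file.touch()
    return "new"


results: List[str] = []
with tempfile.TemporaryDirectory() as tmp:
    db_dir = Path(tmp)
    results.append(open_database(db_dir, "a"))  # nothing there yet
    results.append(open_database(db_dir, "a"))  # reuse
    results.append(open_database(db_dir, "w"))  # delete, then recreate
    results.append(open_database(db_dir, "r"))  # reuse
    (db_dir / "address.db").unlink()
    try:
        open_database(db_dir, "r")
        results.append("no error")
    except SketchError:
        results.append("error")
print(results)
```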
- add_address(address_names: List[str], do_update: bool = False, cache: Optional[LRU] = None, **kwargs) AddressNode
Create a new AddressNode and add to the tree.
- Parameters:
address_names (list of str) -- A list of the address element names. For example, ["東京都","新宿区","西新宿", "2丁目"]
do_update (bool) -- When an address with the same name already exists, update it with the value of kwargs if 'do_update' is true, otherwise do nothing.
cache (LRU, optional) -- A dict object to use as a cache for improving performance, whose keys are the address notation from the prefecture level and whose values are the corresponding nodes. If not specified or None is given, do not use the cache.
**kwargs (properties of the new address node.) --
x: float. X coordinate or longitude in decimal degrees.
y: float. Y coordinate or latitude in decimal degrees.
level: int. Level of the node.
note: str. Note.
- Returns:
The added node.
- Return type:
AddressNode
- check_line_format(args: List[str]) int
Receives split args from a line of comma-separated text representing a single address element, and returns the format ID.
- Parameters:
args (list of str) -- The fields split from one line of comma-separated text.
- Returns:
The id of the identified format.
1. Address names without level, lon, lat
2. Address names without level, lon, lat, note
3. Address names without level, lon, lat, level without note
4. Address names without level, lon, lat, level, note
- Return type:
int
Examples
>>> from jageocoder_converter import BaseConverter
>>> base = BaseConverter()
>>> base.check_line_format(['1;北海道','3;札幌市','4;中央区','141.34103','43.05513'])
1
>>> base.check_line_format(['1;北海道','3;札幌市','4;中央区','5;大通','6;西二十丁目','141.326249','43.057218','01101/ODN-20/'])
2
>>> base.check_line_format(['北海道','札幌市','中央区','大通','西二十丁目','141.326249','43.057218',6])
3
>>> base.check_line_format(['北海道','札幌市','中央区','大通','西二十丁目','141.326249','43.057218',6,'01101/ODN-20/'])
4
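One way to tell the four formats apart is to locate the longitude/latitude pair and look at what follows it. The sketch below does that under two assumptions of its own: coordinates always contain a decimal point, and a single trailing field made only of digits is a level rather than a note. sketch_check_line_format is a hypothetical name and is not the library's implementation.

```python
import re
from typing import Sequence

# A decimal number with a fractional part, e.g. "141.34103".
_DECIMAL = re.compile(r"^[-+]?\d+\.\d+$")


def sketch_check_line_format(args: Sequence[object]) -> int:
    """Guess the format ID (1-4) from the fields that follow lon/lat."""
    fields = [str(a) for a in args]
    # Locate the first pair of adjacent decimal fields: longitude, latitude.
    for pos in range(len(fields) - 1):
        if _DECIMAL.match(fields[pos]) and _DECIMAL.match(fields[pos + 1]):
            break
    else:
        raise ValueError("no longitude/latitude pair found")
    rest = fields[pos + 2:]
    if len(rest) == 0:
        return 1                      # names, lon, lat
    if len(rest) == 1:
        # Assumption: an all-digit field is a level, anything else a note.
        return 3 if rest[0].isdigit() else 2
    if len(rest) == 2:
        return 4                      # names, lon, lat, level, note
    raise ValueError("too many fields after latitude")


print(sketch_check_line_format(
    ['1;北海道', '3;札幌市', '4;中央区', '141.34103', '43.05513']))  # prints 1
```

A note consisting only of digits would be misread as a level by this heuristic, so a real parser needs a stricter rule than the one shown here.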
- create_note_index_table() None
Collect notes from all address elements and create a search table with an index.
- get_address_node(id: int) AddressNode
Get address node from the tree by its id.
- Parameters:
id (int) -- The node id.
- Returns:
Node with the specified ID.
- Return type:
AddressNode
- get_config(keys: Optional[Union[str, List[str]]] = None)
Get configurable parameter(s).
- Parameters:
keys (str, List[str], optional) -- If a single parameter name is given, its value is returned. If a list of names is given, a dict of the specified keys and their values is returned. If omitted, a dict of all parameters is returned.
- Return type:
Any, or dict.
Examples
>>> import jageocoder
>>> jageocoder.init()
>>> jageocoder.get_module_tree().get_config('aza_skip')
'off'
>>> jageocoder.get_module_tree().get_config(['best_only', 'target_area'])
{'best_only': True, 'target_area': []}
>>> jageocoder.get_module_tree().get_config()
{'debug': False, 'aza_skip': 'off', 'best_only': True, 'target_area': [], 'require_coordinates': False}
- get_node_by_id(node_id: int) AddressNode
Get the full node information by its id.
- Parameters:
node_id (int) -- The target node id.
- Return type:
AddressNode
- get_root() AddressNode
Get the root-node of the tree. If not set yet, create and get the node from the database.
- Returns:
The root node object.
- Return type:
AddressNode
- get_trie_nodes() TrieNode
Get the TRIE node table.
Notes
Todo: If the trie index has not been created, create it.
- is_version_compatible() bool
Check if the dictionary version is compatible with the package.
- Returns:
True if compatible, otherwise False.
- Return type:
bool
- parse_line_args(args: List[str], format_id: int) list
Receives split args from a line of comma-separated text representing a single address element, and returns a list of parsed attributes.
- Parameters:
args (list of str) -- The fields split from one line of comma-separated text.
format_id (int) -- The format ID of the line.
- Returns:
A list containing the following attributes.
- Address names: list[str]
- Longitude: float
- Latitude: float
- Level: int or None
- note: str or None
- Return type:
list
Examples
>>> from jageocoder_converter import BaseConverter
>>> base = BaseConverter()
>>> base.parse_line_args(['1;北海道','3;札幌市','4;中央区','141.34103','43.05513'], 1)
[['1;北海道','3;札幌市','4;中央区'], 141.34103, 43.05513, None, None]
>>> base.parse_line_args(['1;北海道','3;札幌市','4;中央区','5;大通','6;西二十丁目','141.326249','43.057218','01101/ODN-20/'], 2)
[['1;北海道','3;札幌市','4;中央区','5;大通','6;西二十丁目'],141.326249,43.057218,None,'01101/ODN-20/']
>>> base.parse_line_args(['北海道','札幌市','中央区','大通','西二十丁目','141.326249','43.057218',6,'01101/ODN-20/'], 4)
[['北海道','札幌市','中央区','大通','西二十丁目'],141.326249,43.057218,6,'01101/ODN-20/']
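Once the format ID is known, the fields can be split by counting back from the end of the list. The following is a minimal sketch of that idea, assuming the field order described above (names, longitude, latitude, then optional level and note). sketch_parse_line_args is a hypothetical name, not the library function.

```python
from typing import List, Optional, Sequence

# Number of fields that follow latitude for each format ID.
_EXTRA_FIELDS = {1: 0, 2: 1, 3: 1, 4: 2}


def sketch_parse_line_args(args: Sequence[object], format_id: int) -> list:
    """Return [names, lon, lat, level, note] for one split line."""
    if format_id not in _EXTRA_FIELDS:
        raise ValueError(f"unknown format id: {format_id}")
    # Longitude sits just before latitude and the optional trailing fields.
    lon_pos = len(args) - _EXTRA_FIELDS[format_id] - 2
    if lon_pos < 1:
        raise ValueError("too few fields for this format")
    names: List[str] = [str(a) for a in args[:lon_pos]]
    lon = float(str(args[lon_pos]))
    lat = float(str(args[lon_pos + 1]))
    level: Optional[int] = None
    note: Optional[str] = None
    if format_id in (3, 4):
        level = int(str(args[lon_pos + 2]))   # level follows latitude
    if format_id in (2, 4):
        note = str(args[-1])                  # note is always last
    return [names, lon, lat, level, note]


parsed = sketch_parse_line_args(
    ['北海道', '札幌市', '中央区', '大通', '西二十丁目',
     '141.326249', '43.057218', 6, '01101/ODN-20/'], 4)
print(parsed)
```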
- read_file(path: PathLike, do_update: bool = False) None
Add AddressNodes from a text file. See 'data/test.txt' for the format of the text file.
- Parameters:
path (os.PathLike) -- Text file path.
do_update (bool (default=False)) -- When an address with the same name already exists, update it with the value of the new data if 'do_update' is true, otherwise do nothing.
- read_stream(fp: TextIO, do_update: bool = False) None
Add AddressNodes to the tree from a stream.
- Parameters:
fp (io.TextIO) -- Input text stream.
do_update (bool (default=False)) -- When an address with the same name already exists, update it with the value of the new data if 'do_update' is true, otherwise do nothing.
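The effect of do_update can be shown with a plain dict standing in for the tree. This sketch assumes the simplest line layout (address names followed by longitude and latitude) and uses a hypothetical function, sketch_read_stream. The second input line carries dummy coordinates chosen only to make the overwrite visible.

```python
import csv
import io
from typing import Dict, TextIO, Tuple

Store = Dict[Tuple[str, ...], Tuple[float, float]]


def sketch_read_stream(fp: TextIO, store: Store,
                       do_update: bool = False) -> None:
    """Read 'names..., lon, lat' lines into store, keyed by the names."""
    for row in csv.reader(fp):
        if not row:
            continue                      # skip blank lines
        *names, lon, lat = row
        key = tuple(names)
        if key in store and not do_update:
            continue                      # existing address: do nothing
        store[key] = (float(lon), float(lat))


first = "北海道,札幌市,中央区,141.34103,43.05513\n"
second = "北海道,札幌市,中央区,141.0,43.0\n"   # dummy coordinates

store: Store = {}
sketch_read_stream(io.StringIO(first), store)
sketch_read_stream(io.StringIO(second), store)                  # ignored
kept = store[("北海道", "札幌市", "中央区")]
sketch_read_stream(io.StringIO(second), store, do_update=True)  # overwritten
updated = store[("北海道", "札幌市", "中央区")]
print(kept, updated)
```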
- searchNode(query: str) List[Result]
Searches for address nodes corresponding to an address notation and returns the matching substring and a list of nodes.
- Parameters:
query (str) -- An address notation to be searched.
- Returns:
A list of AddressNode and matched substring pairs.
- Return type:
List[Result]
Note
The search_by_trie function returns the standardized string as the match string. In contrast, the searchNode function returns the de-standardized string.
Examples
>>> import jageocoder
>>> jageocoder.init()
>>> tree = jageocoder.get_module_tree()
>>> tree.searchNode('多摩市落合1-15-2')
[[[11460207:東京都(139.69178,35.68963)1(lasdec:130001/jisx0401:13)]>[12063502:多摩市(139.446366,35.636959)3(jisx0402:13224)]>[12065383:落合(139.427097,35.624877)5(None)]>[12065384:一丁目(139.427097,35.624877)6(None)]>[12065390:15番地(139.428969,35.625779)7(None)], '多摩市落合1-15-']]
- search_by_tree(address_names: List[str]) AddressNode
Get the node corresponding to the list of address element names by recursively searching the child nodes of the tree.
For example, ['東京都','新宿区','西新宿','二丁目'] will search the '東京都' node under the root node, search the '新宿区' node from the children of the '東京都' node. Repeat this process and return the '二丁目' node which is a child of '西新宿' node.
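The descent described above can be sketched as a short recursive function over nodes that index their children by name. TreeNode and sketch_search_by_tree are hypothetical names, and returning None when an element is missing is an assumption of this sketch, not documented behaviour of search_by_tree.

```python
from typing import Dict, List, Optional


class TreeNode:
    """Hypothetical node that keeps its children in a dict keyed by name."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.children: Dict[str, "TreeNode"] = {}

    def add(self, name: str) -> "TreeNode":
        """Return the child with this name, creating it if necessary."""
        if name not in self.children:
            self.children[name] = TreeNode(name)
        return self.children[name]


def sketch_search_by_tree(node: TreeNode,
                          address_names: List[str]) -> Optional[TreeNode]:
    """Follow address_names one element at a time, starting from node."""
    if not address_names:
        return node                       # every element has been consumed
    child = node.children.get(address_names[0])
    if child is None:
        return None                       # assumption: missing means None
    return sketch_search_by_tree(child, address_names[1:])


root = TreeNode("")
chome = root.add("東京都").add("新宿区").add("西新宿").add("二丁目")
found = sketch_search_by_tree(root, ["東京都", "新宿区", "西新宿", "二丁目"])
missing = sketch_search_by_tree(root, ["東京都", "渋谷区"])
print(found.name if found else None, missing)
```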
- search_by_trie(query: str) dict
Get the list of corresponding nodes using the TRIE index. Returns the address element nodes that match the longest prefix of the query string.
For example, '中央区中央1丁目' will return the nodes corresponding to '千葉県千葉市中央区中央一丁目' and '神奈川県相模原市中央区中央一丁目'.
- Parameters:
query (str) -- An address notation to be searched.
- Returns:
A dict object whose key is a node id and whose value is a list of the node and the substrings that match the query.
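Longest-prefix lookup is the core idea behind a trie search. The sketch below implements it with a small character trie built from nested objects. It is deliberately simplified: the query is not standardized (so '1丁目' would not match '一丁目' here), the returned dict maps each node id to the matched substring only, and SketchTrie, sketch_search_by_trie and the node ids are all hypothetical.

```python
from typing import Dict, List, Tuple


class _TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "_TrieNode"] = {}
        self.node_ids: List[int] = []     # ids of entries ending here


class SketchTrie:
    def __init__(self) -> None:
        self._root = _TrieNode()

    def insert(self, key: str, node_id: int) -> None:
        """Register node_id under key, one character per trie level."""
        cur = self._root
        for ch in key:
            cur = cur.children.setdefault(ch, _TrieNode())
        cur.node_ids.append(node_id)

    def longest_prefix(self, query: str) -> Tuple[str, List[int]]:
        """Return the longest registered prefix of query and its ids."""
        cur = self._root
        best: Tuple[str, List[int]] = ("", [])
        for i, ch in enumerate(query):
            nxt = cur.children.get(ch)
            if nxt is None:
                break                     # no registered key continues here
            cur = nxt
            if cur.node_ids:
                best = (query[: i + 1], list(cur.node_ids))
        return best


def sketch_search_by_trie(trie: SketchTrie, query: str) -> Dict[int, str]:
    matched, node_ids = trie.longest_prefix(query)
    return {node_id: matched for node_id in node_ids}


trie = SketchTrie()
trie.insert("中央区", 1)
trie.insert("中央区中央一丁目", 101)   # e.g. a node in one city
trie.insert("中央区中央一丁目", 202)   # a node with the same name elsewhere
result = sketch_search_by_trie(trie, "中央区中央一丁目5番")
print(result)
```

Because several nodes can share the same notation, one matched prefix can yield more than one candidate, which is why the result is keyed by node id.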
- search_nodes_by_codes(category: str, value: str) List[AddressNode]
Search nodes by category and value.
- Parameters:
category (str) -- The category to search by.
value (str) -- The value to search for.
- Return type:
List[AddressNode]
- set_config(**kwargs)
Set configuration parameters.
Note
The possible keywords and their meanings are as follows.
- best_only: bool (default = True)
If set to False, returns all search result candidates whose prefix matches.
- aza_skip: bool, None (default = False)
Specifies how to skip aza-names while searching nodes.
If None, make the decision automatically.
If False, do not skip.
If True, always skip.
- require_coordinates: bool (default = True)
If set to False, nodes without coordinates are also included in the search.
- target_area: List[str] (default = [])
Specify the areas to be searched. The areas can be specified as a list of node names (such as prefecture names or city names) or JIS codes.
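The keyword-based setter and the three calling styles of get_config can be mimicked with a small wrapper around a dict. The sketch below uses the keywords and defaults listed above. SketchConfig is a hypothetical class, and rejecting unknown keywords with KeyError is an assumption of this sketch rather than documented behaviour.

```python
from typing import Any, Dict, List, Optional, Union


class SketchConfig:
    """Hypothetical holder for the search options listed above."""

    def __init__(self) -> None:
        self._config: Dict[str, Any] = {
            "best_only": True,
            "aza_skip": False,
            "require_coordinates": True,
            "target_area": [],
        }

    def set_config(self, **kwargs: Any) -> None:
        for key, value in kwargs.items():
            if key not in self._config:
                # Assumption: unknown keywords are treated as errors.
                raise KeyError(key)
            self._config[key] = value

    def get_config(
        self, keys: Optional[Union[str, List[str]]] = None
    ) -> Any:
        if keys is None:
            return dict(self._config)             # all parameters
        if isinstance(keys, str):
            return self._config[keys]             # a single value
        return {k: self._config[k] for k in keys}  # selected pairs


config = SketchConfig()
config.set_config(best_only=False, target_area=["東京都"])
print(config.get_config("best_only"))
print(config.get_config(["aza_skip", "target_area"]))
```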